Overview

Dataset statistics

Number of variables: 50
Number of observations: 101766
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 38.8 MiB
Average record size in memory: 400.0 B
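
The two memory figures are consistent with each other and with the per-variable memory size of 795.2 KiB reported for each column. A minimal sketch of the arithmetic, assuming each cell occupies 8 bytes and the index adds a fixed 128 bytes (both are assumptions about the in-memory layout; the report does not state them):

```python
# Assumed layout (not stated in the report): 8 bytes per cell, 128-byte index.
N_ROWS = 101766
N_COLS = 50
BYTES_PER_CELL = 8
INDEX_BYTES = 128

column_bytes = N_ROWS * BYTES_PER_CELL
total_bytes = N_COLS * column_bytes + INDEX_BYTES

total_mib = total_bytes / 2**20                     # total size in MiB
record_bytes = total_bytes / N_ROWS                 # average bytes per row
column_kib = (column_bytes + INDEX_BYTES) / 2**10   # one column plus index, in KiB

print(f"{total_mib:.1f} MiB, {record_bytes:.1f} B per record, {column_kib:.1f} KiB per column")
```

Under these assumptions the rounded results agree with the 38.8 MiB, 400.0 B, and 795.2 KiB shown in the report.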

Variable types

Numeric: 13
Categorical: 34
Boolean: 3

Warnings

examide has constant value "False" Constant
citoglipton has constant value "False" Constant
medical_specialty has a high cardinality: 73 distinct values High cardinality
diag_1 has a high cardinality: 717 distinct values High cardinality
diag_2 has a high cardinality: 749 distinct values High cardinality
diag_3 has a high cardinality: 790 distinct values High cardinality
encounter_id is highly correlated with patient_nbr High correlation
patient_nbr is highly correlated with encounter_id High correlation
encounter_id is highly correlated with patient_nbr and 1 other field High correlation
metformin-pioglitazone is highly correlated with glimepiride-pioglitazone and 2 other fields High correlation
change is highly correlated with insulin and 1 other field High correlation
age is highly correlated with medical_specialty High correlation
payer_code is highly correlated with encounter_id High correlation
admission_type_id is highly correlated with admission_source_id and 2 other fields High correlation
insulin is highly correlated with change and 1 other field High correlation
glimepiride-pioglitazone is highly correlated with metformin-pioglitazone and 2 other fields High correlation
admission_source_id is highly correlated with admission_type_id and 1 other field High correlation
metformin-rosiglitazone is highly correlated with metformin-pioglitazone and 2 other fields High correlation
max_glu_serum is highly correlated with admission_type_id and 1 other field High correlation
diabetesMed is highly correlated with change and 1 other field High correlation
medical_specialty is highly correlated with age and 1 other field High correlation
acetohexamide is highly correlated with metformin-pioglitazone and 2 other fields High correlation
metformin-pioglitazone is highly correlated with examide and 1 other field High correlation
acarbose is highly correlated with examide and 1 other field High correlation
race is highly correlated with examide and 1 other field High correlation
change is highly correlated with examide and 3 other fields High correlation
chlorpropamide is highly correlated with examide and 1 other field High correlation
glyburide-metformin is highly correlated with examide and 1 other field High correlation
nateglinide is highly correlated with examide and 1 other field High correlation
glipizide-metformin is highly correlated with examide and 1 other field High correlation
age is highly correlated with examide and 1 other field High correlation
glyburide is highly correlated with examide and 1 other field High correlation
miglitol is highly correlated with examide and 1 other field High correlation
troglitazone is highly correlated with examide and 1 other field High correlation
examide is highly correlated with metformin-pioglitazone and 32 other fields High correlation
weight is highly correlated with examide and 1 other field High correlation
metformin is highly correlated with examide and 1 other field High correlation
payer_code is highly correlated with examide and 1 other field High correlation
gender is highly correlated with examide and 1 other field High correlation
tolbutamide is highly correlated with examide and 1 other field High correlation
A1Cresult is highly correlated with examide and 1 other field High correlation
glipizide is highly correlated with examide and 1 other field High correlation
insulin is highly correlated with change and 3 other fields High correlation
glimepiride-pioglitazone is highly correlated with examide and 1 other field High correlation
metformin-rosiglitazone is highly correlated with examide and 1 other field High correlation
tolazamide is highly correlated with examide and 1 other field High correlation
glimepiride is highly correlated with examide and 1 other field High correlation
pioglitazone is highly correlated with examide and 1 other field High correlation
max_glu_serum is highly correlated with examide and 1 other field High correlation
citoglipton is highly correlated with metformin-pioglitazone and 32 other fields High correlation
diabetesMed is highly correlated with change and 3 other fields High correlation
readmitted is highly correlated with examide and 1 other field High correlation
medical_specialty is highly correlated with examide and 1 other field High correlation
repaglinide is highly correlated with examide and 1 other field High correlation
acetohexamide is highly correlated with examide and 1 other field High correlation
rosiglitazone is highly correlated with examide and 1 other field High correlation
number_emergency is highly skewed (γ₁ = 22.85558215) Skewed
encounter_id has unique values Unique
num_procedures has 46652 (45.8%) zeros Zeros
number_outpatient has 85027 (83.6%) zeros Zeros
number_emergency has 90383 (88.8%) zeros Zeros
number_inpatient has 67630 (66.5%) zeros Zeros
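
The zero counts and the skewness coefficient in these warnings are plain column summaries. A minimal sketch of how such figures could be computed with the standard library; the helper names `zero_share` and `skewness` are hypothetical, and the population formula for γ₁ used here is an assumption (the report may use a bias-corrected estimator, which differs slightly on small samples):

```python
def zero_share(values):
    """Return (number of zeros, share of zeros) for a numeric column."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, zeros / len(values)

def skewness(values):
    """Population skewness: third central moment divided by sigma cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Toy column standing in for a zero-heavy count variable (illustrative data only).
sample = [0, 0, 0, 0, 0, 0, 0, 1, 1, 10]
zeros, share = zero_share(sample)
print(zeros, f"{share:.1%}", skewness(sample) > 0)

# The percentages in the warnings follow directly from the reported counts.
print(f"{46652 / 101766:.1%}")  # num_procedures
print(f"{85027 / 101766:.1%}")  # number_outpatient
```

A long right tail, as in the toy column, gives a positive coefficient; a symmetric column gives zero.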

Reproduction

Analysis started: 2021-08-29 09:52:34.524765
Analysis finished: 2021-08-29 09:55:03.730245
Duration: 2 minutes and 29.21 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
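
The reported duration is the difference between the two timestamps. A short check using only the standard library, with the timestamp format string being an assumption inferred from how the values are printed:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed from the printed timestamps
started = datetime.strptime("2021-08-29 09:52:34.524765", FMT)
finished = datetime.strptime("2021-08-29 09:55:03.730245", FMT)

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
duration = f"{int(minutes)} minutes and {seconds:.2f} seconds"
print(duration)
```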

Variables

encounter_id
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 101766
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 165201645.6
Minimum: 12522
Maximum: 443867222
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 12522
5-th percentile: 27170784
Q1: 84961194
Median: 152388987
Q3: 230270887.5
95-th percentile: 378962843
Maximum: 443867222
Range: 443854700
Interquartile range (IQR): 145309693.5
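
The range and IQR are derived from the other quantiles (maximum minus minimum, and Q3 minus Q1). The sketch below shows one way to compute these statistics; the `percentile` helper is hypothetical, and linear interpolation between order statistics is an assumption about how the report obtains fractional values such as Q3.

```python
def percentile(sorted_values, q):
    """Percentile q in [0, 100], interpolating linearly between order statistics."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac

# Toy data (illustrative only), sorted before taking percentiles.
data = sorted([12, 7, 3, 21, 9, 15, 4, 18])
q1, median, q3 = (percentile(data, q) for q in (25, 50, 75))
value_range = data[-1] - data[0]
iqr = q3 - q1
print(q1, median, q3, value_range, iqr)

# The figures reported for encounter_id satisfy the same identities.
print(443867222 - 12522 == 443854700)
print(230270887.5 - 84961194 == 145309693.5)
```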

Descriptive statistics

Standard deviation: 102640296
Coefficient of variation (CV): 0.6213031087
Kurtosis: -0.1020713932
Mean: 165201645.6
Median Absolute Deviation (MAD): 70921143
Skewness: 0.6991415513
Sum: 1.681191067 × 10^13
Variance: 1.053503036 × 10^16
Monotonicity: Not monotonic
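
Several of these statistics are tied together: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A small check against the rounded figures reported for encounter_id, with the tolerance chosen as an assumption to absorb the rounding in the report:

```python
from math import isclose

# Figures reported for encounter_id (rounded as shown in the report).
n = 101766
mean = 165201645.6
std = 102640296
reported_cv = 0.6213031087
reported_variance = 1.053503036e16
reported_sum = 1.681191067e13

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum recovered from the mean and the row count

checks = {
    "cv": isclose(cv, reported_cv, rel_tol=1e-6),
    "variance": isclose(variance, reported_variance, rel_tol=1e-6),
    "sum": isclose(total, reported_sum, rel_tol=1e-6),
}
print(checks)
```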
Histogram with fixed size bins (bins=50)
Value                    Count    Frequency (%)
77856768                 1        < 0.1%
279188820                1        < 0.1%
96928116                 1        < 0.1%
117757260                1        < 0.1%
166782282                1        < 0.1%
57734472                 1        < 0.1%
299087888                1        < 0.1%
158635332                1        < 0.1%
195630354                1        < 0.1%
83723106                 1        < 0.1%
Other values (101756)    101756   > 99.9%

Value                    Count    Frequency (%)
12522                    1        < 0.1%
15738                    1        < 0.1%
16680                    1        < 0.1%
28236                    1        < 0.1%
35754                    1        < 0.1%
36900                    1        < 0.1%
40926                    1        < 0.1%
42570                    1        < 0.1%
55842                    1        < 0.1%
62256                    1        < 0.1%

Value                    Count    Frequency (%)
443867222                1        < 0.1%
443857166                1        < 0.1%
443854148                1        < 0.1%
443847782                1        < 0.1%
443847548                1        < 0.1%
443847176                1        < 0.1%
443842778                1        < 0.1%
443842340                1        < 0.1%
443842136                1        < 0.1%
443842070                1        < 0.1%

patient_nbr
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 71518
Distinct (%): 70.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54330400.69
Minimum: 135
Maximum: 189502619
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 135
5-th percentile: 1456971.75
Q1: 23413221
Median: 45505143
Q3: 87545949.75
95-th percentile: 111480273
Maximum: 189502619
Range: 189502484
Interquartile range (IQR): 64132728.75

Descriptive statistics

Standard deviation: 38696359.35
Coefficient of variation (CV): 0.7122413759
Kurtosis: -0.3473720444
Mean: 54330400.69
Median Absolute Deviation (MAD): 32950134
Skewness: 0.4712807224
Sum: 5.528987557 × 10^12
Variance: 1.497408227 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count    Frequency (%)
88785891                40       < 0.1%
43140906                28       < 0.1%
88227540                23       < 0.1%
1660293                 23       < 0.1%
23199021                23       < 0.1%
23643405                22       < 0.1%
84428613                22       < 0.1%
92709351                21       < 0.1%
90609804                20       < 0.1%
88789707                20       < 0.1%
Other values (71508)    101524   99.8%

Value                   Count    Frequency (%)
135                     2        < 0.1%
378                     1        < 0.1%
729                     1        < 0.1%
774                     1        < 0.1%
927                     1        < 0.1%
1152                    5        < 0.1%
1305                    1        < 0.1%
1314                    3        < 0.1%
1629                    1        < 0.1%
2025                    1        < 0.1%

Value                   Count    Frequency (%)
189502619               1        < 0.1%
189481478               1        < 0.1%
189445127               1        < 0.1%
189365864               1        < 0.1%
189351095               1        < 0.1%
189349430               1        < 0.1%
189332087               1        < 0.1%
189298877               1        < 0.1%
189257846               2        < 0.1%
189215762               1        < 0.1%

race
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
Caucasian: 76099
AfricanAmerican: 19210
?: 2273
Hispanic: 2037
Other: 1506

Length

Max length: 15
Median length: 9
Mean length: 9.849507694
Min length: 1

Characters and Unicode

Total characters: 1002345
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
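
As an illustration of these character properties, the sketch below recomputes the per-category character totals for race from the value counts listed under Common Values. It relies on `unicodedata.category` from the Python standard library, whose two-letter codes (`Ll`, `Lu`, `Po`) correspond to Lowercase Letter, Uppercase Letter, and Other Punctuation; the helper name `category_counts` is hypothetical. The standard library does not expose script or block properties, so those are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(values_with_counts):
    """Count characters per Unicode general category, weighted by value frequency."""
    totals = Counter()
    for value, count in values_with_counts.items():
        for ch in value:
            totals[unicodedata.category(ch)] += count
    return totals

# Value counts for race, copied from the Common Values table.
race_counts = {
    "Caucasian": 76099,
    "AfricanAmerican": 19210,
    "?": 2273,
    "Hispanic": 2037,
    "Other": 1506,
    "Asian": 641,
}

totals = category_counts(race_counts)
print(dict(totals))
```

The resulting totals can be compared with the "Most occurring categories" table for the same variable.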

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Caucasian
2nd row: Caucasian
3rd row: AfricanAmerican
4th row: Caucasian
5th row: Caucasian

Common Values

Value              Count    Frequency (%)
Caucasian          76099    74.8%
AfricanAmerican    19210    18.9%
?                  2273     2.2%
Hispanic           2037     2.0%
Other              1506     1.5%
Asian              641      0.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
caucasian76099
74.8%
africanamerican19210
 
18.9%
2273
 
2.2%
hispanic2037
 
2.0%
other1506
 
1.5%
asian641
 
0.6%

Most occurring characters

ValueCountFrequency (%)
a269395
26.9%
i119234
11.9%
n117197
11.7%
c116556
11.6%
s78777
 
7.9%
C76099
 
7.6%
u76099
 
7.6%
r39926
 
4.0%
A39061
 
3.9%
e20716
 
2.1%
Other values (8)49285
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter881369
87.9%
Uppercase Letter118703
 
11.8%
Other Punctuation2273
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a269395
30.6%
i119234
13.5%
n117197
13.3%
c116556
13.2%
s78777
 
8.9%
u76099
 
8.6%
r39926
 
4.5%
e20716
 
2.4%
f19210
 
2.2%
m19210
 
2.2%
Other values (3)5049
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
C76099
64.1%
A39061
32.9%
H2037
 
1.7%
O1506
 
1.3%
Other Punctuation
ValueCountFrequency (%)
?2273
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1000072
99.8%
Common2273
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a269395
26.9%
i119234
11.9%
n117197
11.7%
c116556
11.7%
s78777
 
7.9%
C76099
 
7.6%
u76099
 
7.6%
r39926
 
4.0%
A39061
 
3.9%
e20716
 
2.1%
Other values (7)47012
 
4.7%
Common
ValueCountFrequency (%)
?2273
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1002345
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a269395
26.9%
i119234
11.9%
n117197
11.7%
c116556
11.6%
s78777
 
7.9%
C76099
 
7.6%
u76099
 
7.6%
r39926
 
4.0%
A39061
 
3.9%
e20716
 
2.1%
Other values (8)49285
 
4.9%

gender
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
Female: 54708
Male: 47055
Unknown/Invalid: 3

Length

Max length: 15
Median length: 6
Mean length: 5.075496728
Min length: 4

Characters and Unicode

Total characters: 516513
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Female
4th row: Male
5th row: Male

Common Values

Value              Count    Frequency (%)
Female             54708    53.8%
Male               47055    46.2%
Unknown/Invalid    3        < 0.1%

Length

Histogram of lengths of the category
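
The length statistics for gender can be re-derived from the value counts alone. A minimal sketch, assuming the counts from the Common Values table; the helper `median_length` is hypothetical and returns the lower median, which equals the ordinary median whenever the two middle values have the same length, as they do here:

```python
# Value counts for gender, copied from the Common Values table.
gender_counts = {"Female": 54708, "Male": 47055, "Unknown/Invalid": 3}

n_rows = sum(gender_counts.values())
total_chars = sum(len(value) * count for value, count in gender_counts.items())
mean_length = total_chars / n_rows
max_length = max(len(value) for value in gender_counts)
min_length = min(len(value) for value in gender_counts)

def median_length(counts):
    """Lower median of the string lengths, weighted by how often each value occurs."""
    half = sum(counts.values()) / 2
    running = 0
    for length, count in sorted((len(v), c) for v, c in counts.items()):
        running += count
        if running >= half:
            return length

print(n_rows, total_chars, min_length, median_length(gender_counts), max_length)
print(round(mean_length, 9))
```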

Pie chart

ValueCountFrequency (%)
female54708
53.8%
male47055
46.2%
unknown/invalid3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e156471
30.3%
a101766
19.7%
l101766
19.7%
F54708
 
10.6%
m54708
 
10.6%
M47055
 
9.1%
n12
 
< 0.1%
U3
 
< 0.1%
k3
 
< 0.1%
o3
 
< 0.1%
Other values (6)18
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter414741
80.3%
Uppercase Letter101769
 
19.7%
Other Punctuation3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e156471
37.7%
a101766
24.5%
l101766
24.5%
m54708
 
13.2%
n12
 
< 0.1%
k3
 
< 0.1%
o3
 
< 0.1%
w3
 
< 0.1%
v3
 
< 0.1%
i3
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
F54708
53.8%
M47055
46.2%
U3
 
< 0.1%
I3
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin516510
> 99.9%
Common3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e156471
30.3%
a101766
19.7%
l101766
19.7%
F54708
 
10.6%
m54708
 
10.6%
M47055
 
9.1%
n12
 
< 0.1%
U3
 
< 0.1%
k3
 
< 0.1%
o3
 
< 0.1%
Other values (5)15
 
< 0.1%
Common
ValueCountFrequency (%)
/3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII516513
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e156471
30.3%
a101766
19.7%
l101766
19.7%
F54708
 
10.6%
m54708
 
10.6%
M47055
 
9.1%
n12
 
< 0.1%
U3
 
< 0.1%
k3
 
< 0.1%
o3
 
< 0.1%
Other values (6)18
 
< 0.1%

age
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
[70-80): 26068
[60-70): 22483
[50-60): 17256
[80-90): 17197
[40-50): 9685
Other values (5): 9077

Length

Max length: 8
Median length: 7
Mean length: 7.025863255
Min length: 6

Characters and Unicode

Total characters: 714994
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [0-10)
2nd row: [10-20)
3rd row: [20-30)
4th row: [30-40)
5th row: [40-50)

Common Values

Value       Count    Frequency (%)
[70-80)     26068    25.6%
[60-70)     22483    22.1%
[50-60)     17256    17.0%
[80-90)     17197    16.9%
[40-50)     9685     9.5%
[30-40)     3775     3.7%
[90-100)    2793     2.7%
[20-30)     1657     1.6%
[10-20)     691      0.7%
[0-10)      161      0.2%
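
Because age is stored as interval labels rather than numbers, it is profiled as a categorical variable. One possible way to work with it numerically is to parse each label into its bounds; the sketch below does that and estimates a mean age from bin midpoints. The helper `parse_bin` and the midpoint approximation are illustrative assumptions, not something the report itself computes.

```python
import re

BIN_PATTERN = re.compile(r"^\[(\d+)-(\d+)\)$")

def parse_bin(label):
    """Parse a half-open interval label such as '[70-80)' into (low, high)."""
    match = BIN_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not an interval label: {label!r}")
    return int(match.group(1)), int(match.group(2))

# Value counts for age, copied from the Common Values table.
age_counts = {
    "[70-80)": 26068, "[60-70)": 22483, "[50-60)": 17256, "[80-90)": 17197,
    "[40-50)": 9685, "[30-40)": 3775, "[90-100)": 2793, "[20-30)": 1657,
    "[10-20)": 691, "[0-10)": 161,
}

total = sum(age_counts.values())
# Approximate each patient's age by the midpoint of the bin (an assumption).
approx_mean = sum(
    sum(parse_bin(label)) / 2 * count for label, count in age_counts.items()
) / total
print(total, round(approx_mean, 1))
```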

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
70-8026068
25.6%
60-7022483
22.1%
50-6017256
17.0%
80-9017197
16.9%
40-509685
 
9.5%
30-403775
 
3.7%
90-1002793
 
2.7%
20-301657
 
1.6%
10-20691
 
0.7%
0-10161
 
0.2%

Most occurring characters

ValueCountFrequency (%)
0206325
28.9%
[101766
14.2%
-101766
14.2%
)101766
14.2%
748551
 
6.8%
843265
 
6.1%
639739
 
5.6%
526941
 
3.8%
919990
 
2.8%
413460
 
1.9%
Other values (3)11425
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number409696
57.3%
Open Punctuation101766
 
14.2%
Dash Punctuation101766
 
14.2%
Close Punctuation101766
 
14.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0206325
50.4%
748551
 
11.9%
843265
 
10.6%
639739
 
9.7%
526941
 
6.6%
919990
 
4.9%
413460
 
3.3%
35432
 
1.3%
13645
 
0.9%
22348
 
0.6%
Open Punctuation
ValueCountFrequency (%)
[101766
100.0%
Dash Punctuation
ValueCountFrequency (%)
-101766
100.0%
Close Punctuation
ValueCountFrequency (%)
)101766
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common714994
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0206325
28.9%
[101766
14.2%
-101766
14.2%
)101766
14.2%
748551
 
6.8%
843265
 
6.1%
639739
 
5.6%
526941
 
3.8%
919990
 
2.8%
413460
 
1.9%
Other values (3)11425
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII714994
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0206325
28.9%
[101766
14.2%
-101766
14.2%
)101766
14.2%
748551
 
6.8%
843265
 
6.1%
639739
 
5.6%
526941
 
3.8%
919990
 
2.8%
413460
 
1.9%
Other values (3)11425
 
1.6%

weight
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
?: 98569
[75-100): 1336
[50-75): 897
[100-125): 625
[125-150): 145
Other values (5): 194

Length

Max length: 9
Median length: 1
Mean length: 1.217096083
Min length: 1

Characters and Unicode

Total characters: 123859
Distinct characters: 10
Distinct categories: 6
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ?
2nd row: ?
3rd row: ?
4th row: ?
5th row: ?

Common Values

Value        Count    Frequency (%)
?            98569    96.9%
[75-100)     1336     1.3%
[50-75)      897      0.9%
[100-125)    625      0.6%
[125-150)    145      0.1%
[25-50)      97       0.1%
[0-25)       48       < 0.1%
[150-175)    35       < 0.1%
[175-200)    11       < 0.1%
>200         3        < 0.1%
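
The report lists 0 missing cells, yet `?` is the most frequent value of weight. If `?` is a placeholder for missing data (an assumption; the report treats it as an ordinary category), the effective missing share per column follows from the counts in the Common Values tables. A short sketch of that calculation:

```python
PLACEHOLDER = "?"  # assumed missing-value marker; the report counts it as a category
N_ROWS = 101766

# Counts of the placeholder, copied from the Common Values tables in this report.
placeholder_counts = {
    "weight": 98569,
    "medical_specialty": 49949,
    "payer_code": 40256,
    "race": 2273,
}

def missing_share(count, n_rows=N_ROWS):
    """Share of rows that would be missing if the placeholder were treated as NA."""
    return count / n_rows

shares = {column: missing_share(count) for column, count in placeholder_counts.items()}
for column, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{column}: {share:.1%}")
```

When loading the raw data with pandas, passing `na_values="?"` to `read_csv` is one way to have the placeholder parsed as missing from the start.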

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
98569
96.9%
75-1001336
 
1.3%
50-75897
 
0.9%
100-125625
 
0.6%
125-150145
 
0.1%
25-5097
 
0.1%
0-2548
 
< 0.1%
150-17535
 
< 0.1%
175-20011
 
< 0.1%
2003
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
?98569
79.6%
05172
 
4.2%
54368
 
3.5%
[3194
 
2.6%
-3194
 
2.6%
)3194
 
2.6%
12957
 
2.4%
72279
 
1.8%
2929
 
0.8%
>3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation98569
79.6%
Decimal Number15705
 
12.7%
Open Punctuation3194
 
2.6%
Dash Punctuation3194
 
2.6%
Close Punctuation3194
 
2.6%
Math Symbol3
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
05172
32.9%
54368
27.8%
12957
18.8%
72279
14.5%
2929
 
5.9%
Other Punctuation
ValueCountFrequency (%)
?98569
100.0%
Open Punctuation
ValueCountFrequency (%)
[3194
100.0%
Dash Punctuation
ValueCountFrequency (%)
-3194
100.0%
Close Punctuation
ValueCountFrequency (%)
)3194
100.0%
Math Symbol
ValueCountFrequency (%)
>3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common123859
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
?98569
79.6%
05172
 
4.2%
54368
 
3.5%
[3194
 
2.6%
-3194
 
2.6%
)3194
 
2.6%
12957
 
2.4%
72279
 
1.8%
2929
 
0.8%
>3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII123859
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
?98569
79.6%
05172
 
4.2%
54368
 
3.5%
[3194
 
2.6%
-3194
 
2.6%
)3194
 
2.6%
12957
 
2.4%
72279
 
1.8%
2929
 
0.8%
>3
 
< 0.1%

admission_type_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.024006053
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 3
95-th percentile: 6
Maximum: 8
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.44540283
Coefficient of variation (CV): 0.7141296972
Kurtosis: 1.942476114
Mean: 2.024006053
Median Absolute Deviation (MAD): 0
Skewness: 1.591984327
Sum: 205975
Variance: 2.08918934
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value    Count    Frequency (%)
1        53990    53.1%
3        18869    18.5%
2        18480    18.2%
6        5291     5.2%
5        4785     4.7%
8        320      0.3%
7        21       < 0.1%
4        10       < 0.1%

Value    Count    Frequency (%)
1        53990    53.1%
2        18480    18.2%
3        18869    18.5%
4        10       < 0.1%
5        4785     4.7%
6        5291     5.2%
7        21       < 0.1%
8        320      0.3%

Value    Count    Frequency (%)
8        320      0.3%
7        21       < 0.1%
6        5291     5.2%
5        4785     4.7%
4        10       < 0.1%
3        18869    18.5%
2        18480    18.2%
1        53990    53.1%

discharge_disposition_id
Real number (ℝ≥0)

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.715641766
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 4
95-th percentile: 18
Maximum: 28
Range: 27
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 5.280165509
Coefficient of variation (CV): 1.421064204
Kurtosis: 6.003346764
Mean: 3.715641766
Median Absolute Deviation (MAD): 0
Skewness: 2.563066993
Sum: 378126
Variance: 27.88014781
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value                Count    Frequency (%)
1                    60234    59.2%
3                    13954    13.7%
6                    12902    12.7%
18                   3691     3.6%
2                    2128     2.1%
22                   1993     2.0%
11                   1642     1.6%
5                    1184     1.2%
25                   989      1.0%
4                    815      0.8%
Other values (16)    2234     2.2%

Value                Count    Frequency (%)
1                    60234    59.2%
2                    2128     2.1%
3                    13954    13.7%
4                    815      0.8%
5                    1184     1.2%
6                    12902    12.7%
7                    623      0.6%
8                    108      0.1%
9                    21       < 0.1%
10                   6        < 0.1%

Value                Count    Frequency (%)
28                   139      0.1%
27                   5        < 0.1%
25                   989      1.0%
24                   48       < 0.1%
23                   412      0.4%
22                   1993     2.0%
20                   2        < 0.1%
19                   8        < 0.1%
18                   3691     3.6%
17                   14       < 0.1%

admission_source_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.754436649
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 7
Q3: 7
95-th percentile: 17
Maximum: 25
Range: 24
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.064080834
Coefficient of variation (CV): 0.7062517293
Kurtosis: 1.744989372
Mean: 5.754436649
Median Absolute Deviation (MAD): 0
Skewness: 1.029934878
Sum: 585606
Variance: 16.51675303
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value               Count    Frequency (%)
7                   57494    56.5%
1                   29565    29.1%
17                  6781     6.7%
4                   3187     3.1%
6                   2264     2.2%
2                   1104     1.1%
5                   855      0.8%
3                   187      0.2%
20                  161      0.2%
9                   125      0.1%
Other values (7)    43       < 0.1%

Value               Count    Frequency (%)
1                   29565    29.1%
2                   1104     1.1%
3                   187      0.2%
4                   3187     3.1%
5                   855      0.8%
6                   2264     2.2%
7                   57494    56.5%
8                   16       < 0.1%
9                   125      0.1%
10                  8        < 0.1%

Value               Count    Frequency (%)
25                  2        < 0.1%
22                  12       < 0.1%
20                  161      0.2%
17                  6781     6.7%
14                  2        < 0.1%
13                  1        < 0.1%
11                  2        < 0.1%
10                  8        < 0.1%
9                   125      0.1%
8                   16       < 0.1%

time_in_hospital
Real number (ℝ≥0)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.395986872
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 11
Maximum: 14
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.985107767
Coefficient of variation (CV): 0.6790529304
Kurtosis: 0.8502508405
Mean: 4.395986872
Median Absolute Deviation (MAD): 2
Skewness: 1.133998719
Sum: 447362
Variance: 8.910868383
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value               Count    Frequency (%)
3                   17756    17.4%
2                   17224    16.9%
1                   14208    14.0%
4                   13924    13.7%
5                   9966     9.8%
6                   7539     7.4%
7                   5859     5.8%
8                   4391     4.3%
9                   3002     2.9%
10                  2342     2.3%
Other values (4)    5555     5.5%

Value               Count    Frequency (%)
1                   14208    14.0%
2                   17224    16.9%
3                   17756    17.4%
4                   13924    13.7%
5                   9966     9.8%
6                   7539     7.4%
7                   5859     5.8%
8                   4391     4.3%
9                   3002     2.9%
10                  2342     2.3%

Value               Count    Frequency (%)
14                  1042     1.0%
13                  1210     1.2%
12                  1448     1.4%
11                  1855     1.8%
10                  2342     2.3%
9                   3002     2.9%
8                   4391     4.3%
7                   5859     5.8%
6                   7539     7.4%
5                   9966     9.8%

payer_code
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
?: 40256
MC: 32439
HM: 6274
SP: 5007
BC: 4655
Other values (13): 13135

Length

Max length: 2
Median length: 2
Mean length: 1.60442584
Min length: 1

Characters and Unicode

Total characters: 163276
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ?
2nd row: ?
3rd row: ?
4th row: ?
5th row: ?

Common Values

Value               Count    Frequency (%)
?                   40256    39.6%
MC                  32439    31.9%
HM                  6274     6.2%
SP                  5007     4.9%
BC                  4655     4.6%
MD                  3532     3.5%
CP                  2533     2.5%
UN                  2448     2.4%
CM                  1937     1.9%
OG                  1033     1.0%
Other values (8)    1652     1.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
40256
39.6%
mc32439
31.9%
hm6274
 
6.2%
sp5007
 
4.9%
bc4655
 
4.6%
md3532
 
3.5%
cp2533
 
2.5%
un2448
 
2.4%
cm1937
 
1.9%
og1033
 
1.0%
Other values (8)1652
 
1.6%

Most occurring characters

ValueCountFrequency (%)
M44810
27.4%
C41845
25.6%
?40256
24.7%
P8211
 
5.0%
H6420
 
3.9%
S5062
 
3.1%
B4655
 
2.9%
D4081
 
2.5%
U2448
 
1.5%
N2448
 
1.5%
Other values (7)3040
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter123020
75.3%
Other Punctuation40256
 
24.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M44810
36.4%
C41845
34.0%
P8211
 
6.7%
H6420
 
5.2%
S5062
 
4.1%
B4655
 
3.8%
D4081
 
3.3%
U2448
 
2.0%
N2448
 
2.0%
O1720
 
1.4%
Other values (6)1320
 
1.1%
Other Punctuation
ValueCountFrequency (%)
?40256
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin123020
75.3%
Common40256
 
24.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
M44810
36.4%
C41845
34.0%
P8211
 
6.7%
H6420
 
5.2%
S5062
 
4.1%
B4655
 
3.8%
D4081
 
3.3%
U2448
 
2.0%
N2448
 
2.0%
O1720
 
1.4%
Other values (6)1320
 
1.1%
Common
ValueCountFrequency (%)
?40256
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII163276
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M44810
27.4%
C41845
25.6%
?40256
24.7%
P8211
 
5.0%
H6420
 
3.9%
S5062
 
3.1%
B4655
 
2.9%
D4081
 
2.5%
U2448
 
1.5%
N2448
 
1.5%
Other values (7)3040
 
1.9%

medical_specialty
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
?: 49949
InternalMedicine: 14635
Emergency/Trauma: 7565
Family/GeneralPractice: 7440
Cardiology: 5352
Other values (68): 16825

Length

Max length: 36
Median length: 8
Mean length: 8.612670243
Min length: 1

Characters and Unicode

Total characters: 876477
Distinct characters: 44
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: Pediatrics-Endocrinology
2nd row: ?
3rd row: ?
4th row: ?
5th row: ?

Common Values

Value                         Count    Frequency (%)
?                             49949    49.1%
InternalMedicine              14635    14.4%
Emergency/Trauma              7565     7.4%
Family/GeneralPractice        7440     7.3%
Cardiology                    5352     5.3%
Surgery-General               3099     3.0%
Nephrology                    1613     1.6%
Orthopedics                   1400     1.4%
Orthopedics-Reconstructive    1233     1.2%
Radiologist                   1140     1.1%
Other values (63)             8340     8.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
49949
49.1%
internalmedicine14635
 
14.4%
emergency/trauma7565
 
7.4%
family/generalpractice7440
 
7.3%
cardiology5352
 
5.3%
surgery-general3099
 
3.0%
nephrology1613
 
1.6%
orthopedics1400
 
1.4%
orthopedics-reconstructive1233
 
1.2%
radiologist1140
 
1.1%
Other values (63)8340
 
8.2%

Most occurring characters

ValueCountFrequency (%)
e105151
 
12.0%
r76899
 
8.8%
a71149
 
8.1%
n68798
 
7.8%
i63308
 
7.2%
c50007
 
5.7%
?49949
 
5.7%
l48871
 
5.6%
y34937
 
4.0%
t34149
 
3.9%
Other values (34)273259
31.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter705846
80.5%
Uppercase Letter98148
 
11.2%
Other Punctuation65856
 
7.5%
Dash Punctuation6627
 
0.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e105151
14.9%
r76899
10.9%
a71149
10.1%
n68798
9.7%
i63308
9.0%
c50007
7.1%
l48871
6.9%
y34937
 
4.9%
t34149
 
4.8%
o34053
 
4.8%
Other values (13)118524
16.8%
Uppercase Letter
ValueCountFrequency (%)
M15055
15.3%
I14683
15.0%
G11882
12.1%
P10448
10.6%
T8332
8.5%
E7861
8.0%
F7451
7.6%
C6307
6.4%
S5156
 
5.3%
O4146
 
4.2%
Other values (7)6827
7.0%
Other Punctuation
ValueCountFrequency (%)
?49949
75.8%
/15871
 
24.1%
&36
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
-6627
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin803994
91.7%
Common72483
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e105151
13.1%
r76899
 
9.6%
a71149
 
8.8%
n68798
 
8.6%
i63308
 
7.9%
c50007
 
6.2%
l48871
 
6.1%
y34937
 
4.3%
t34149
 
4.2%
o34053
 
4.2%
Other values (30)216672
26.9%
Common
ValueCountFrequency (%)
?49949
68.9%
/15871
 
21.9%
-6627
 
9.1%
&36
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII876477
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e105151
 
12.0%
r76899
 
8.8%
a71149
 
8.1%
n68798
 
7.8%
i63308
 
7.2%
c50007
 
5.7%
?49949
 
5.7%
l48871
 
5.6%
y34937
 
4.0%
t34149
 
3.9%
Other values (34)273259
31.2%

num_lab_procedures
Real number (ℝ≥0)

Distinct: 118
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.09564098
Minimum: 1
Maximum: 132
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 31
Median: 44
Q3: 57
95-th percentile: 73
Maximum: 132
Range: 131
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 19.67436225
Coefficient of variation (CV): 0.4565278947
Kurtosis: -0.2450735189
Mean: 43.09564098
Median Absolute Deviation (MAD): 13
Skewness: -0.2365439206
Sum: 4385671
Variance: 387.0805299
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
1                     3208     3.2%
43                    2804     2.8%
44                    2496     2.5%
45                    2376     2.3%
38                    2213     2.2%
40                    2201     2.2%
46                    2189     2.2%
41                    2117     2.1%
42                    2113     2.1%
47                    2106     2.1%
Other values (108)    77943    76.6%

Value                 Count    Frequency (%)
1                     3208     3.2%
2                     1101     1.1%
3                     668      0.7%
4                     378      0.4%
5                     286      0.3%
6                     282      0.3%
7                     323      0.3%
8                     366      0.4%
9                     933      0.9%
10                    838      0.8%

Value                 Count    Frequency (%)
132                   1        < 0.1%
129                   1        < 0.1%
126                   1        < 0.1%
121                   1        < 0.1%
120                   1        < 0.1%
118                   1        < 0.1%
114                   2        < 0.1%
113                   3        < 0.1%
111                   3        < 0.1%
109                   4        < 0.1%

num_procedures
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.339730362
Minimum: 0
Maximum: 6
Zeros: 46652
Zeros (%): 45.8%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.705806979
Coefficient of variation (CV): 1.273246489
Kurtosis: 0.8571103021
Mean: 1.339730362
Median Absolute Deviation (MAD): 1
Skewness: 1.316414763
Sum: 136339
Variance: 2.90977745
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
ValueCountFrequency (%)
046652
45.8%
120742
20.4%
212717
 
12.5%
39443
 
9.3%
64954
 
4.9%
44180
 
4.1%
53078
 
3.0%
ValueCountFrequency (%)
046652
45.8%
120742
20.4%
212717
 
12.5%
39443
 
9.3%
44180
 
4.1%
53078
 
3.0%
64954
 
4.9%
ValueCountFrequency (%)
64954
 
4.9%
53078
 
3.0%
44180
 
4.1%
39443
 
9.3%
212717
 
12.5%
120742
20.4%
046652
45.8%
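
The summary figures reported for num_procedures can be re-derived from its frequency table alone. The following is a minimal sketch, assuming the seven value/count pairs listed above are complete and that the report uses the sample variance (divisor n - 1, which is the pandas default). The variable names are illustrative and do not come from the report.

```python
import math

# Value -> count pairs copied from the num_procedures frequency table.
counts = {0: 46652, 1: 20742, 2: 12717, 3: 9443, 4: 4180, 5: 3078, 6: 4954}

n = sum(counts.values())                                # number of observations
total = sum(value * count for value, count in counts.items())  # the "Sum" row
mean = total / n

# Sample variance (divisor n - 1); this divisor is an assumption about the report.
variance = sum(count * (value - mean) ** 2 for value, count in counts.items()) / (n - 1)
std = math.sqrt(variance)

cv = std / mean                     # coefficient of variation = std / mean
zeros_pct = 100 * counts[0] / n     # the "Zeros (%)" row

print(f"n={n} sum={total} mean={mean:.9f}")
print(f"variance={variance:.8f} std={std:.9f} cv={cv:.9f} zeros={zeros_pct:.1f}%")
```

The same recomputation should work for any of the count variables in this report whose frequency table lists every distinct value without an "Other values" row.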

num_medications
Real number (ℝ≥0)

Distinct: 75
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.02184423
Minimum: 1
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 10
Median: 15
Q3: 20
95-th percentile: 31
Maximum: 81
Range: 80
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.127566209
Coefficient of variation (CV): 0.5072803163
Kurtosis: 3.468154915
Mean: 16.02184423
Median Absolute Deviation (MAD): 5
Skewness: 1.326672134
Sum: 1630479
Variance: 66.05733248
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value              Count   Frequency (%)
13                 6086    6.0%
12                 6004    5.9%
11                 5795    5.7%
15                 5792    5.7%
14                 5707    5.6%
16                 5430    5.3%
10                 5346    5.3%
17                 4919    4.8%
9                  4913    4.8%
18                 4523    4.4%
Other values (65)  47251   46.4%

Value  Count  Frequency (%)
1      262    0.3%
2      470    0.5%
3      900    0.9%
4      1417   1.4%
5      2017   2.0%
6      2699   2.7%
7      3484   3.4%
8      4353   4.3%
9      4913   4.8%
10     5346   5.3%

Value  Count  Frequency (%)
81     1      < 0.1%
79     1      < 0.1%
75     2      < 0.1%
74     1      < 0.1%
72     3      < 0.1%
70     2      < 0.1%
69     5      < 0.1%
68     7      < 0.1%
67     7      < 0.1%
66     5      < 0.1%

number_outpatient
Real number (ℝ≥0)

ZEROS

Distinct: 39
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3693571527
Minimum: 0
Maximum: 42
Zeros: 85027
Zeros (%): 83.6%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.267265097
Coefficient of variation (CV): 3.431001911
Kurtosis: 147.9077363
Mean: 0.3693571527
Median Absolute Deviation (MAD): 0
Skewness: 8.832958927
Sum: 37588
Variance: 1.605960825
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)

Value              Count   Frequency (%)
0                  85027   83.6%
1                  8547    8.4%
2                  3594    3.5%
3                  2042    2.0%
4                  1099    1.1%
5                  533     0.5%
6                  303     0.3%
7                  155     0.2%
8                  98      0.1%
9                  83      0.1%
Other values (29)  285     0.3%

Value  Count  Frequency (%)
0      85027  83.6%
1      8547   8.4%
2      3594   3.5%
3      2042   2.0%
4      1099   1.1%
5      533    0.5%
6      303    0.3%
7      155    0.2%
8      98     0.1%
9      83     0.1%

Value  Count  Frequency (%)
42     1      < 0.1%
40     1      < 0.1%
39     1      < 0.1%
38     1      < 0.1%
37     1      < 0.1%
36     2      < 0.1%
35     2      < 0.1%
34     1      < 0.1%
33     2      < 0.1%
29     2      < 0.1%

number_emergency
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1978362125
Minimum: 0
Maximum: 76
Zeros: 90383
Zeros (%): 88.8%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 76
Range: 76
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.9304722684
Coefficient of variation (CV): 4.703245461
Kurtosis: 1191.686726
Mean: 0.1978362125
Median Absolute Deviation (MAD): 0
Skewness: 22.85558215
Sum: 20133
Variance: 0.8657786423
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)

Value              Count   Frequency (%)
0                  90383   88.8%
1                  7677    7.5%
2                  2042    2.0%
3                  725     0.7%
4                  374     0.4%
5                  192     0.2%
6                  94      0.1%
7                  73      0.1%
8                  50      < 0.1%
10                 34      < 0.1%
Other values (23)  122     0.1%

Value  Count  Frequency (%)
0      90383  88.8%
1      7677   7.5%
2      2042   2.0%
3      725    0.7%
4      374    0.4%
5      192    0.2%
6      94     0.1%
7      73     0.1%
8      50     < 0.1%
9      33     < 0.1%

Value  Count  Frequency (%)
76     1      < 0.1%
64     1      < 0.1%
63     1      < 0.1%
54     1      < 0.1%
46     1      < 0.1%
42     1      < 0.1%
37     1      < 0.1%
29     1      < 0.1%
28     1      < 0.1%
25     2      < 0.1%

number_inpatient
Real number (ℝ≥0)

ZEROS

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6355659061
Minimum: 0
Maximum: 21
Zeros: 67630
Zeros (%): 66.5%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 21
Range: 21
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.26286329
Coefficient of variation (CV): 1.986990299
Kurtosis: 20.71939695
Mean: 0.6355659061
Median Absolute Deviation (MAD): 0
Skewness: 3.614138992
Sum: 64679
Variance: 1.594823689
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)

Value              Count   Frequency (%)
0                  67630   66.5%
1                  19521   19.2%
2                  7566    7.4%
3                  3411    3.4%
4                  1622    1.6%
5                  812     0.8%
6                  480     0.5%
7                  268     0.3%
8                  151     0.1%
9                  111     0.1%
Other values (11)  194     0.2%

Value  Count  Frequency (%)
0      67630  66.5%
1      19521  19.2%
2      7566   7.4%
3      3411   3.4%
4      1622   1.6%
5      812    0.8%
6      480    0.5%
7      268    0.3%
8      151    0.1%
9      111    0.1%

Value  Count  Frequency (%)
21     1      < 0.1%
19     2      < 0.1%
18     1      < 0.1%
17     1      < 0.1%
16     6      < 0.1%
15     9      < 0.1%
14     10     < 0.1%
13     20     < 0.1%
12     34     < 0.1%
11     49     < 0.1%

diag_1
Categorical

HIGH CARDINALITY

Distinct: 717
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value               Count
428                 6862
414                 6581
786                 4016
410                 3614
486                 3508
Other values (712)  77185

Length

Max length: 6
Median length: 3
Mean length: 3.175215691
Min length: 1

Characters and Unicode

Total characters: 323129
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 0.1%

Sample

1st row: 250.83
2nd row: 276
3rd row: 648
4th row: 8
5th row: 197
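
The character categories this report lists for textual variables (Decimal Number, Other Punctuation, Uppercase Letter and so on) correspond to Unicode general categories. The sketch below shows how such counts could be reproduced with Python's standard `unicodedata` module. It is an illustration only: the report does not say how it computes these figures, `category_counts` is a hypothetical helper, and the input list mixes two sample rows with made-up "V27" and "?" entries.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for char in value:
            code = unicodedata.category(char)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Illustrative input, not the real column.
example = ["250.83", "276", "V27", "?"]
result = category_counts(example)
print(result)
```

Both "." and "?" fall into Other Punctuation, which is why a placeholder such as "?" shows up in the punctuation counts rather than as a missing value.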

Common Values

Value               Count   Frequency (%)
428                 6862    6.7%
414                 6581    6.5%
786                 4016    3.9%
410                 3614    3.6%
486                 3508    3.4%
427                 2766    2.7%
491                 2275    2.2%
715                 2151    2.1%
682                 2042    2.0%
434                 2028    2.0%
Other values (707)  65923   64.8%

Length

Histogram of lengths of the category

Most occurring characters

Value             Count   Frequency (%)
4                 55457   17.2%
2                 39876   12.3%
8                 37949   11.7%
5                 37131   11.5%
7                 28668   8.9%
1                 28106   8.7%
0                 24960   7.7%
6                 23198   7.2%
9                 19978   6.2%
3                 17618   5.5%
Other values (4)  10188   3.2%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     312941   96.8%
Other Punctuation  8543     2.6%
Uppercase Letter   1645     0.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
4      55457  17.7%
2      39876  12.7%
8      37949  12.1%
5      37131  11.9%
7      28668  9.2%
1      28106  9.0%
0      24960  8.0%
6      23198  7.4%
9      19978  6.4%
3      17618  5.6%

Other Punctuation
Value  Count  Frequency (%)
.      8522   99.8%
?      21     0.2%

Uppercase Letter
Value  Count  Frequency (%)
V      1644   99.9%
E      1      0.1%

Most occurring scripts

Value   Count    Frequency (%)
Common  321484   99.5%
Latin   1645     0.5%

Most frequent character per script

Common
Value             Count   Frequency (%)
4                 55457   17.3%
2                 39876   12.4%
8                 37949   11.8%
5                 37131   11.5%
7                 28668   8.9%
1                 28106   8.7%
0                 24960   7.8%
6                 23198   7.2%
9                 19978   6.2%
3                 17618   5.5%
Other values (2)  8543    2.7%

Latin
Value  Count  Frequency (%)
V      1644   99.9%
E      1      0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  323129   100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
4                 55457   17.2%
2                 39876   12.3%
8                 37949   11.7%
5                 37131   11.5%
7                 28668   8.9%
1                 28106   8.7%
0                 24960   7.7%
6                 23198   7.2%
9                 19978   6.2%
3                 17618   5.5%
Other values (4)  10188   3.2%

diag_2
Categorical

HIGH CARDINALITY

Distinct: 749
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value               Count
276                 6752
428                 6662
250                 6071
427                 5036
401                 3736
Other values (744)  73509

Length

Max length: 6
Median length: 3
Mean length: 3.166194996
Min length: 1

Characters and Unicode

Total characters: 322211
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 124
Unique (%): 0.1%

Sample

1st row: ?
2nd row: 250.01
3rd row: 250
4th row: 250.43
5th row: 157

Common Values

Value               Count   Frequency (%)
276                 6752    6.6%
428                 6662    6.5%
250                 6071    6.0%
427                 5036    4.9%
401                 3736    3.7%
496                 3305    3.2%
599                 3288    3.2%
403                 2823    2.8%
414                 2650    2.6%
411                 2566    2.5%
Other values (739)  58877   57.9%

Length

Histogram of lengths of the category

Most occurring characters

Value             Count   Frequency (%)
4                 51155   15.9%
2                 49765   15.4%
5                 38176   11.8%
0                 34046   10.6%
8                 28711   8.9%
7                 28654   8.9%
1                 26158   8.1%
9                 21842   6.8%
6                 19990   6.2%
3                 14097   4.4%
Other values (4)  9617    3.0%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     312594   97.0%
Other Punctuation  7081     2.2%
Uppercase Letter   2536     0.8%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
4      51155  16.4%
2      49765  15.9%
5      38176  12.2%
0      34046  10.9%
8      28711  9.2%
7      28654  9.2%
1      26158  8.4%
9      21842  7.0%
6      19990  6.4%
3      14097  4.5%

Other Punctuation
Value  Count  Frequency (%)
.      6723   94.9%
?      358    5.1%

Uppercase Letter
Value  Count  Frequency (%)
V      1805   71.2%
E      731    28.8%

Most occurring scripts

Value   Count    Frequency (%)
Common  319675   99.2%
Latin   2536     0.8%

Most frequent character per script

Common
Value             Count   Frequency (%)
4                 51155   16.0%
2                 49765   15.6%
5                 38176   11.9%
0                 34046   10.7%
8                 28711   9.0%
7                 28654   9.0%
1                 26158   8.2%
9                 21842   6.8%
6                 19990   6.3%
3                 14097   4.4%
Other values (2)  7081    2.2%

Latin
Value  Count  Frequency (%)
V      1805   71.2%
E      731    28.8%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  322211   100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
4                 51155   15.9%
2                 49765   15.4%
5                 38176   11.8%
0                 34046   10.6%
8                 28711   8.9%
7                 28654   8.9%
1                 26158   8.1%
9                 21842   6.8%
6                 19990   6.2%
3                 14097   4.4%
Other values (4)  9617    3.0%

diag_3
Categorical

HIGH CARDINALITY

Distinct: 790
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value               Count
250                 11555
401                 8289
276                 5175
428                 4577
427                 3955
Other values (785)  68215

Length

Max length: 6
Median length: 3
Mean length: 3.111658118
Min length: 1

Characters and Unicode

Total characters: 316661
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 122
Unique (%): 0.1%

Sample

1st row: ?
2nd row: 255
3rd row: V27
4th row: 403
5th row: 250
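
The report lists no missing cells for diag_3, yet the first sample row is "?", so "?" appears to serve as a placeholder for an unrecorded diagnosis. The sample also contains a letter-prefixed code ("V27"). The following is a minimal sketch of how such values might be bucketed before modelling, assuming "?" means missing and that a leading V or E marks a separate code family. Neither assumption is stated in the report, and `classify_code` is a hypothetical helper.

```python
from collections import Counter


def classify_code(code):
    """Bucket a diagnosis string by its leading character (assumed convention)."""
    if code == "?":
        return "missing"
    if code.startswith("V"):
        return "V-code"
    if code.startswith("E"):
        return "E-code"
    return "numeric"


# The five diag_3 sample rows shown above.
sample = ["?", "255", "V27", "403", "250"]

# Replace the assumed placeholder with None so it is no longer counted as a value.
cleaned = [None if code == "?" else code for code in sample]
buckets = Counter(classify_code(code) for code in sample)

print(cleaned)
print(buckets)
```

When the data is loaded with pandas, passing `na_values="?"` to `pandas.read_csv` would mark these entries as missing at load time, which would also change the "Missing" counts a profiling report shows.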

Common Values

Value               Count   Frequency (%)
250                 11555   11.4%
401                 8289    8.1%
276                 5175    5.1%
428                 4577    4.5%
427                 3955    3.9%
414                 3664    3.6%
496                 2605    2.6%
403                 2357    2.3%
585                 1992    2.0%
272                 1969    1.9%
Other values (780)  55628   54.7%

Length

Histogram of lengths of the category

Most occurring characters

Value             Count   Frequency (%)
2                 51244   16.2%
4                 49252   15.6%
5                 41260   13.0%
0                 39711   12.5%
7                 26504   8.4%
1                 24684   7.8%
8                 23825   7.5%
9                 17323   5.5%
6                 16441   5.2%
3                 14333   4.5%
Other values (4)  12084   3.8%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     304577   96.2%
Other Punctuation  7026     2.2%
Uppercase Letter   5058     1.6%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      51244  16.8%
4      49252  16.2%
5      41260  13.5%
0      39711  13.0%
7      26504  8.7%
1      24684  8.1%
8      23825  7.8%
9      17323  5.7%
6      16441  5.4%
3      14333  4.7%

Other Punctuation
Value  Count  Frequency (%)
.      5603   79.7%
?      1423   20.3%

Uppercase Letter
Value  Count  Frequency (%)
V      3814   75.4%
E      1244   24.6%

Most occurring scripts

Value   Count    Frequency (%)
Common  311603   98.4%
Latin   5058     1.6%

Most frequent character per script

Common
Value             Count   Frequency (%)
2                 51244   16.4%
4                 49252   15.8%
5                 41260   13.2%
0                 39711   12.7%
7                 26504   8.5%
1                 24684   7.9%
8                 23825   7.6%
9                 17323   5.6%
6                 16441   5.3%
3                 14333   4.6%
Other values (2)  7026    2.3%

Latin
Value  Count  Frequency (%)
V      3814   75.4%
E      1244   24.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  316661   100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
2                 51244   16.2%
4                 49252   15.6%
5                 41260   13.0%
0                 39711   12.5%
7                 26504   8.4%
1                 24684   7.8%
8                 23825   7.5%
9                 17323   5.5%
6                 16441   5.2%
3                 14333   4.5%
Other values (4)  12084   3.8%

number_diagnoses
Real number (ℝ≥0)

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.422606765
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 6
Median: 8
Q3: 9
95-th percentile: 9
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.933600145
Coefficient of variation (CV): 0.2605014931
Kurtosis: -0.07905602427
Mean: 7.422606765
Median Absolute Deviation (MAD): 1
Skewness: -0.8767462388
Sum: 755369
Variance: 3.738809521
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)

Value             Count   Frequency (%)
9                 49474   48.6%
5                 11393   11.2%
8                 10616   10.4%
7                 10393   10.2%
6                 10161   10.0%
4                 5537    5.4%
3                 2835    2.8%
2                 1023    1.0%
1                 219     0.2%
16                45      < 0.1%
Other values (6)  70      0.1%

Value  Count  Frequency (%)
1      219    0.2%
2      1023   1.0%
3      2835   2.8%
4      5537   5.4%
5      11393  11.2%
6      10161  10.0%
7      10393  10.2%
8      10616  10.4%
9      49474  48.6%
10     17     < 0.1%

Value  Count  Frequency (%)
16     45     < 0.1%
15     10     < 0.1%
14     7      < 0.1%
13     16     < 0.1%
12     9      < 0.1%
11     11     < 0.1%
10     17     < 0.1%
9      49474  48.6%
8      10616  10.4%
7      10393  10.2%

max_glu_serum
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value  Count
None   96420
Norm   2597
>200   1485
>300   1264

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 407064
Distinct characters: 10
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None
2nd row: None
3rd row: None
4th row: None
5th row: None

Common Values

Value  Count  Frequency (%)
None   96420  94.7%
Norm   2597   2.6%
>200   1485   1.5%
>300   1264   1.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
none   96420  94.7%
norm   2597   2.6%
200    1485   1.5%
300    1264   1.2%

Most occurring characters

Value  Count  Frequency (%)
N      99017  24.3%
o      99017  24.3%
n      96420  23.7%
e      96420  23.7%
0      5498   1.4%
>      2749   0.7%
r      2597   0.6%
m      2597   0.6%
2      1485   0.4%
3      1264   0.3%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  297051   73.0%
Uppercase Letter  99017    24.3%
Decimal Number    8247     2.0%
Math Symbol       2749     0.7%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      99017  33.3%
n      96420  32.5%
e      96420  32.5%
r      2597   0.9%
m      2597   0.9%

Decimal Number
Value  Count  Frequency (%)
0      5498   66.7%
2      1485   18.0%
3      1264   15.3%

Uppercase Letter
Value  Count  Frequency (%)
N      99017  100.0%

Math Symbol
Value  Count  Frequency (%)
>      2749   100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   396068   97.3%
Common  10996    2.7%

Most frequent character per script

Latin
Value  Count  Frequency (%)
N      99017  25.0%
o      99017  25.0%
n      96420  24.3%
e      96420  24.3%
r      2597   0.7%
m      2597   0.7%

Common
Value  Count  Frequency (%)
0      5498   50.0%
>      2749   25.0%
2      1485   13.5%
3      1264   11.5%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  407064   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
N      99017  24.3%
o      99017  24.3%
n      96420  23.7%
e      96420  23.7%
0      5498   1.4%
>      2749   0.7%
r      2597   0.6%
m      2597   0.6%
2      1485   0.4%
3      1264   0.3%

A1Cresult
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value  Count
None   84748
>8     8216
Norm   4990
>7     3812

Length

Max length: 4
Median length: 4
Mean length: 3.763614567
Min length: 2

Characters and Unicode

Total characters: 383008
Distinct characters: 9
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: None
2nd row: None
3rd row: None
4th row: None
5th row: None

Common Values

Value  Count  Frequency (%)
None   84748  83.3%
>8     8216   8.1%
Norm   4990   4.9%
>7     3812   3.7%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
none   84748  83.3%
8      8216   8.1%
norm   4990   4.9%
7      3812   3.7%

Most occurring characters

Value  Count  Frequency (%)
N      89738  23.4%
o      89738  23.4%
n      84748  22.1%
e      84748  22.1%
>      12028  3.1%
8      8216   2.1%
r      4990   1.3%
m      4990   1.3%
7      3812   1.0%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  269214   70.3%
Uppercase Letter  89738    23.4%
Math Symbol       12028    3.1%
Decimal Number    12028    3.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      89738  33.3%
n      84748  31.5%
e      84748  31.5%
r      4990   1.9%
m      4990   1.9%

Decimal Number
Value  Count  Frequency (%)
8      8216   68.3%
7      3812   31.7%

Uppercase Letter
Value  Count  Frequency (%)
N      89738  100.0%

Math Symbol
Value  Count  Frequency (%)
>      12028  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   358952   93.7%
Common  24056    6.3%

Most frequent character per script

Latin
Value  Count  Frequency (%)
N      89738  25.0%
o      89738  25.0%
n      84748  23.6%
e      84748  23.6%
r      4990   1.4%
m      4990   1.4%

Common
Value  Count  Frequency (%)
>      12028  50.0%
8      8216   34.2%
7      3812   15.8%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  383008   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
N      89738  23.4%
o      89738  23.4%
n      84748  22.1%
e      84748  22.1%
>      12028  3.1%
8      8216   2.1%
r      4990   1.3%
m      4990   1.3%
7      3812   1.0%

metformin
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value   Count
No      81778
Steady  18346
Up      1067
Down    575

Length

Max length: 6
Median length: 2
Mean length: 2.732405715
Min length: 2

Characters and Unicode

Total characters: 278066
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No
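
The length and character totals reported for metformin follow directly from its four category counts. The sketch below recomputes them, assuming the counts in the summary above are exact. The variable names are illustrative, and the weighted-median loop is one simple way to obtain the median length, not necessarily how the report computes it.

```python
# Category -> count pairs copied from the metformin summary.
counts = {"No": 81778, "Steady": 18346, "Up": 1067, "Down": 575}

n = sum(counts.values())
total_chars = sum(len(label) * count for label, count in counts.items())
mean_length = total_chars / n
min_length = min(len(label) for label in counts)
max_length = max(len(label) for label in counts)

# Weighted median: walk the labels from shortest to longest until
# at least half of the rows are covered.
covered = 0
median_length = None
for label in sorted(counts, key=len):
    covered += counts[label]
    if covered >= n / 2:
        median_length = len(label)
        break

print(total_chars, round(mean_length, 9), min_length, median_length, max_length)
```

Because "No" alone accounts for more than half of the rows, the median length is 2 even though the mean is pulled upward by the six-character "Steady".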

Common Values

Value   Count  Frequency (%)
No      81778  80.4%
Steady  18346  18.0%
Up      1067   1.0%
Down    575    0.6%

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
no      81778  80.4%
steady  18346  18.0%
up      1067   1.0%
down    575    0.6%

Most occurring characters

Value             Count   Frequency (%)
o                 82353   29.6%
N                 81778   29.4%
S                 18346   6.6%
t                 18346   6.6%
e                 18346   6.6%
a                 18346   6.6%
d                 18346   6.6%
y                 18346   6.6%
U                 1067    0.4%
p                 1067    0.4%
Other values (3)  1725    0.6%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  176300   63.4%
Uppercase Letter  101766   36.6%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      82353  46.7%
t      18346  10.4%
e      18346  10.4%
a      18346  10.4%
d      18346  10.4%
y      18346  10.4%
p      1067   0.6%
w      575    0.3%
n      575    0.3%

Uppercase Letter
Value  Count  Frequency (%)
N      81778  80.4%
S      18346  18.0%
U      1067   1.0%
D      575    0.6%

Most occurring scripts

Value  Count    Frequency (%)
Latin  278066   100.0%

Most frequent character per script

Latin
Value             Count   Frequency (%)
o                 82353   29.6%
N                 81778   29.4%
S                 18346   6.6%
t                 18346   6.6%
e                 18346   6.6%
a                 18346   6.6%
d                 18346   6.6%
y                 18346   6.6%
U                 1067    0.4%
p                 1067    0.4%
Other values (3)  1725    0.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  278066   100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
o                 82353   29.6%
N                 81778   29.4%
S                 18346   6.6%
t                 18346   6.6%
e                 18346   6.6%
a                 18346   6.6%
d                 18346   6.6%
y                 18346   6.6%
U                 1067    0.4%
p                 1067    0.4%
Other values (3)  1725    0.6%

repaglinide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value   Count
No      100227
Steady  1384
Up      110
Down    45

Length

Max length: 6
Median length: 2
Mean length: 2.05528369
Min length: 2

Characters and Unicode

Total characters: 209158
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      100227  98.5%
Steady  1384    1.4%
Up      110     0.1%
Down    45      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      100227  98.5%
steady  1384    1.4%
up      110     0.1%
down    45      < 0.1%

Most occurring characters

Value             Count    Frequency (%)
o                 100272   47.9%
N                 100227   47.9%
S                 1384     0.7%
t                 1384     0.7%
e                 1384     0.7%
a                 1384     0.7%
d                 1384     0.7%
y                 1384     0.7%
U                 110      0.1%
p                 110      0.1%
Other values (3)  135      0.1%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  107392   51.3%
Uppercase Letter  101766   48.7%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
o      100272  93.4%
t      1384    1.3%
e      1384    1.3%
a      1384    1.3%
d      1384    1.3%
y      1384    1.3%
p      110     0.1%
w      45      < 0.1%
n      45      < 0.1%

Uppercase Letter
Value  Count   Frequency (%)
N      100227  98.5%
S      1384    1.4%
U      110     0.1%
D      45      < 0.1%

Most occurring scripts

Value  Count    Frequency (%)
Latin  209158   100.0%

Most frequent character per script

Latin
Value             Count    Frequency (%)
o                 100272   47.9%
N                 100227   47.9%
S                 1384     0.7%
t                 1384     0.7%
e                 1384     0.7%
a                 1384     0.7%
d                 1384     0.7%
y                 1384     0.7%
U                 110      0.1%
p                 110      0.1%
Other values (3)  135      0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  209158   100.0%

Most frequent character per block

ASCII
Value             Count    Frequency (%)
o                 100272   47.9%
N                 100227   47.9%
S                 1384     0.7%
t                 1384     0.7%
e                 1384     0.7%
a                 1384     0.7%
d                 1384     0.7%
y                 1384     0.7%
U                 110      0.1%
p                 110      0.1%
Other values (3)  135      0.1%

nateglinide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value   Count
No      101063
Steady  668
Up      24
Down    11

Length

Max length: 6
Median length: 2
Mean length: 2.026472496
Min length: 2

Characters and Unicode

Total characters: 206226
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101063  99.3%
Steady  668     0.7%
Up      24      < 0.1%
Down    11      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101063  99.3%
steady  668     0.7%
up      24      < 0.1%
down    11      < 0.1%

Most occurring characters

Value             Count    Frequency (%)
o                 101074   49.0%
N                 101063   49.0%
S                 668      0.3%
t                 668      0.3%
e                 668      0.3%
a                 668      0.3%
d                 668      0.3%
y                 668      0.3%
U                 24       < 0.1%
p                 24       < 0.1%
Other values (3)  33       < 0.1%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  104460   50.7%
Uppercase Letter  101766   49.3%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
o      101074  96.8%
t      668     0.6%
e      668     0.6%
a      668     0.6%
d      668     0.6%
y      668     0.6%
p      24      < 0.1%
w      11      < 0.1%
n      11      < 0.1%

Uppercase Letter
Value  Count   Frequency (%)
N      101063  99.3%
S      668     0.7%
U      24      < 0.1%
D      11      < 0.1%

Most occurring scripts

Value  Count    Frequency (%)
Latin  206226   100.0%

Most frequent character per script

Latin
Value             Count    Frequency (%)
o                 101074   49.0%
N                 101063   49.0%
S                 668      0.3%
t                 668      0.3%
e                 668      0.3%
a                 668      0.3%
d                 668      0.3%
y                 668      0.3%
U                 24       < 0.1%
p                 24       < 0.1%
Other values (3)  33       < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  206226   100.0%

Most frequent character per block

ASCII
Value             Count    Frequency (%)
o                 101074   49.0%
N                 101063   49.0%
S                 668      0.3%
t                 668      0.3%
e                 668      0.3%
a                 668      0.3%
d                 668      0.3%
y                 668      0.3%
U                 24       < 0.1%
p                 24       < 0.1%
Other values (3)  33       < 0.1%

chlorpropamide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value   Count
No      101680
Steady  79
Up      6
Down    1

Length

Max length: 6
Median length: 2
Mean length: 2.003124816
Min length: 2

Characters and Unicode

Total characters: 203850
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101680  99.9%
Steady  79      0.1%
Up      6       < 0.1%
Down    1       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101680  99.9%
steady  79      0.1%
up      6       < 0.1%
down    1       < 0.1%

Most occurring characters

Value             Count    Frequency (%)
o                 101681   49.9%
N                 101680   49.9%
S                 79       < 0.1%
t                 79       < 0.1%
e                 79       < 0.1%
a                 79       < 0.1%
d                 79       < 0.1%
y                 79       < 0.1%
U                 6        < 0.1%
p                 6        < 0.1%
Other values (3)  3        < 0.1%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  102084   50.1%
Uppercase Letter  101766   49.9%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
o      101681  99.6%
t      79      0.1%
e      79      0.1%
a      79      0.1%
d      79      0.1%
y      79      0.1%
p      6       < 0.1%
w      1       < 0.1%
n      1       < 0.1%

Uppercase Letter
Value  Count   Frequency (%)
N      101680  99.9%
S      79      0.1%
U      6       < 0.1%
D      1       < 0.1%

Most occurring scripts

Value  Count    Frequency (%)
Latin  203850   100.0%

Most frequent character per script

Latin
Value             Count    Frequency (%)
o                 101681   49.9%
N                 101680   49.9%
S                 79       < 0.1%
t                 79       < 0.1%
e                 79       < 0.1%
a                 79       < 0.1%
d                 79       < 0.1%
y                 79       < 0.1%
U                 6        < 0.1%
p                 6        < 0.1%
Other values (3)  3        < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  203850   100.0%

Most frequent character per block

ASCII
Value             Count    Frequency (%)
o                 101681   49.9%
N                 101680   49.9%
S                 79       < 0.1%
t                 79       < 0.1%
e                 79       < 0.1%
a                 79       < 0.1%
d                 79       < 0.1%
y                 79       < 0.1%
U                 6        < 0.1%
p                 6        < 0.1%
Other values (3)  3        < 0.1%

glimepiride
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value   Count
No      96575
Steady  4670
Up      327
Down    194

Length

Max length: 6
Median length: 2
Mean length: 2.187371028
Min length: 2

Characters and Unicode

Total characters: 222600
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count  Frequency (%)
No      96575  94.9%
Steady  4670   4.6%
Up      327    0.3%
Down    194    0.2%

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
no      96575  94.9%
steady  4670   4.6%
up      327    0.3%
down    194    0.2%

Most occurring characters

Value             Count   Frequency (%)
o                 96769   43.5%
N                 96575   43.4%
S                 4670    2.1%
t                 4670    2.1%
e                 4670    2.1%
a                 4670    2.1%
d                 4670    2.1%
y                 4670    2.1%
U                 327     0.1%
p                 327     0.1%
Other values (3)  582     0.3%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  120834   54.3%
Uppercase Letter  101766   45.7%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      96769  80.1%
t      4670   3.9%
e      4670   3.9%
a      4670   3.9%
d      4670   3.9%
y      4670   3.9%
p      327    0.3%
w      194    0.2%
n      194    0.2%

Uppercase Letter
Value  Count  Frequency (%)
N      96575  94.9%
S      4670   4.6%
U      327    0.3%
D      194    0.2%

Most occurring scripts

Value  Count    Frequency (%)
Latin  222600   100.0%

Most frequent character per script

Latin
Value             Count   Frequency (%)
o                 96769   43.5%
N                 96575   43.4%
S                 4670    2.1%
t                 4670    2.1%
e                 4670    2.1%
a                 4670    2.1%
d                 4670    2.1%
y                 4670    2.1%
U                 327     0.1%
p                 327     0.1%
Other values (3)  582     0.3%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  222600   100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
o                 96769   43.5%
N                 96575   43.4%
S                 4670    2.1%
t                 4670    2.1%
e                 4670    2.1%
a                 4670    2.1%
d                 4670    2.1%
y                 4670    2.1%
U                 327     0.1%
p                 327     0.1%
Other values (3)  582     0.3%

acetohexamide
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value   Count
No      101765
Steady  1

Length

Max length: 6
Median length: 2
Mean length: 2.000039306
Min length: 2

Characters and Unicode

Total characters: 203536
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101765  > 99.9%
Steady  1       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101765  > 99.9%
steady  1       < 0.1%

Most occurring characters

Value  Count   Frequency (%)
N      101765  50.0%
o      101765  50.0%
S      1       < 0.1%
t      1       < 0.1%
e      1       < 0.1%
a      1       < 0.1%
d      1       < 0.1%
y      1       < 0.1%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  101770   50.0%
Uppercase Letter  101766   50.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
o      101765  > 99.9%
t      1       < 0.1%
e      1       < 0.1%
a      1       < 0.1%
d      1       < 0.1%
y      1       < 0.1%

Uppercase Letter
Value  Count   Frequency (%)
N      101765  > 99.9%
S      1       < 0.1%

Most occurring scripts

Value  Count    Frequency (%)
Latin  203536   100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
N      101765  50.0%
o      101765  50.0%
S      1       < 0.1%
t      1       < 0.1%
e      1       < 0.1%
a      1       < 0.1%
d      1       < 0.1%
y      1       < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  203536   100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
N      101765  50.0%
o      101765  50.0%
S      1       < 0.1%
t      1       < 0.1%
e      1       < 0.1%
a      1       < 0.1%
d      1       < 0.1%
y      1       < 0.1%

glipizide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB

Value   Count
No      89080
Steady  11356
Up      770
Down    560

Length

Max length: 6
Median length: 2
Mean length: 2.45736297
Min length: 2

Characters and Unicode

Total characters: 250076
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: Steady
4th row: No
5th row: Steady

Common Values

Value   Count  Frequency (%)
No      89080  87.5%
Steady  11356  11.2%
Up      770    0.8%
Down    560    0.6%

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
no      89080  87.5%
steady  11356  11.2%
up      770    0.8%
down    560    0.6%

Most occurring characters

ValueCountFrequency (%)
o89640
35.8%
N89080
35.6%
S11356
 
4.5%
t11356
 
4.5%
e11356
 
4.5%
a11356
 
4.5%
d11356
 
4.5%
y11356
 
4.5%
U770
 
0.3%
p770
 
0.3%
Other values (3)1680
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter148310
59.3%
Uppercase Letter101766
40.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o89640
60.4%
t11356
 
7.7%
e11356
 
7.7%
a11356
 
7.7%
d11356
 
7.7%
y11356
 
7.7%
p770
 
0.5%
w560
 
0.4%
n560
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
N89080
87.5%
S11356
 
11.2%
U770
 
0.8%
D560
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin250076
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o89640
35.8%
N89080
35.6%
S11356
 
4.5%
t11356
 
4.5%
e11356
 
4.5%
a11356
 
4.5%
d11356
 
4.5%
y11356
 
4.5%
U770
 
0.3%
p770
 
0.3%
Other values (3)1680
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII250076
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o89640
35.8%
N89080
35.6%
S11356
 
4.5%
t11356
 
4.5%
e11356
 
4.5%
a11356
 
4.5%
d11356
 
4.5%
y11356
 
4.5%
U770
 
0.3%
p770
 
0.3%
Other values (3)1680
 
0.7%

glyburide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 91116
Steady: 9274
Up: 812
Down: 564

Length

Max length: 6
Median length: 2
Mean length: 2.375606784
Min length: 2

Characters and Unicode

Total characters: 241756
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count  Frequency (%)
No      91116  89.5%
Steady  9274   9.1%
Up      812    0.8%
Down    564    0.6%

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
no      91116  89.5%
steady  9274   9.1%
up      812    0.8%
down    564    0.6%

Most occurring characters

ValueCountFrequency (%)
o91680
37.9%
N91116
37.7%
S9274
 
3.8%
t9274
 
3.8%
e9274
 
3.8%
a9274
 
3.8%
d9274
 
3.8%
y9274
 
3.8%
U812
 
0.3%
p812
 
0.3%
Other values (3)1692
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter139990
57.9%
Uppercase Letter101766
42.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o91680
65.5%
t9274
 
6.6%
e9274
 
6.6%
a9274
 
6.6%
d9274
 
6.6%
y9274
 
6.6%
p812
 
0.6%
w564
 
0.4%
n564
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
N91116
89.5%
S9274
 
9.1%
U812
 
0.8%
D564
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin241756
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o91680
37.9%
N91116
37.7%
S9274
 
3.8%
t9274
 
3.8%
e9274
 
3.8%
a9274
 
3.8%
d9274
 
3.8%
y9274
 
3.8%
U812
 
0.3%
p812
 
0.3%
Other values (3)1692
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII241756
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o91680
37.9%
N91116
37.7%
S9274
 
3.8%
t9274
 
3.8%
e9274
 
3.8%
a9274
 
3.8%
d9274
 
3.8%
y9274
 
3.8%
U812
 
0.3%
p812
 
0.3%
Other values (3)1692
 
0.7%

tolbutamide
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101743
Steady: 23

Length

Max length: 6
Median length: 2
Mean length: 2.000904035
Min length: 2

Characters and Unicode

Total characters: 203624
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101743  > 99.9%
Steady  23      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101743  > 99.9%
steady  23      < 0.1%

Most occurring characters

ValueCountFrequency (%)
N101743
50.0%
o101743
50.0%
S23
 
< 0.1%
t23
 
< 0.1%
e23
 
< 0.1%
a23
 
< 0.1%
d23
 
< 0.1%
y23
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter101858
50.0%
Uppercase Letter101766
50.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101743
99.9%
t23
 
< 0.1%
e23
 
< 0.1%
a23
 
< 0.1%
d23
 
< 0.1%
y23
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101743
> 99.9%
S23
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin203624
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N101743
50.0%
o101743
50.0%
S23
 
< 0.1%
t23
 
< 0.1%
e23
 
< 0.1%
a23
 
< 0.1%
d23
 
< 0.1%
y23
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203624
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N101743
50.0%
o101743
50.0%
S23
 
< 0.1%
t23
 
< 0.1%
e23
 
< 0.1%
a23
 
< 0.1%
d23
 
< 0.1%
y23
 
< 0.1%

pioglitazone
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 94438
Steady: 6976
Up: 234
Down: 118

Length

Max length: 6
Median length: 2
Mean length: 2.276516715
Min length: 2

Characters and Unicode

Total characters: 231672
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count  Frequency (%)
No      94438  92.8%
Steady  6976   6.9%
Up      234    0.2%
Down    118    0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
no      94438  92.8%
steady  6976   6.9%
up      234    0.2%
down    118    0.1%

Most occurring characters

ValueCountFrequency (%)
o94556
40.8%
N94438
40.8%
S6976
 
3.0%
t6976
 
3.0%
e6976
 
3.0%
a6976
 
3.0%
d6976
 
3.0%
y6976
 
3.0%
U234
 
0.1%
p234
 
0.1%
Other values (3)354
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter129906
56.1%
Uppercase Letter101766
43.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o94556
72.8%
t6976
 
5.4%
e6976
 
5.4%
a6976
 
5.4%
d6976
 
5.4%
y6976
 
5.4%
p234
 
0.2%
w118
 
0.1%
n118
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N94438
92.8%
S6976
 
6.9%
U234
 
0.2%
D118
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin231672
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o94556
40.8%
N94438
40.8%
S6976
 
3.0%
t6976
 
3.0%
e6976
 
3.0%
a6976
 
3.0%
d6976
 
3.0%
y6976
 
3.0%
U234
 
0.1%
p234
 
0.1%
Other values (3)354
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII231672
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o94556
40.8%
N94438
40.8%
S6976
 
3.0%
t6976
 
3.0%
e6976
 
3.0%
a6976
 
3.0%
d6976
 
3.0%
y6976
 
3.0%
U234
 
0.1%
p234
 
0.1%
Other values (3)354
 
0.2%

rosiglitazone
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 95401
Steady: 6100
Up: 178
Down: 87

Length

Max length: 6
Median length: 2
Mean length: 2.241475542
Min length: 2

Characters and Unicode

Total characters: 228106
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count  Frequency (%)
No      95401  93.7%
Steady  6100   6.0%
Up      178    0.2%
Down    87     0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
no      95401  93.7%
steady  6100   6.0%
up      178    0.2%
down    87     0.1%

Most occurring characters

ValueCountFrequency (%)
o95488
41.9%
N95401
41.8%
S6100
 
2.7%
t6100
 
2.7%
e6100
 
2.7%
a6100
 
2.7%
d6100
 
2.7%
y6100
 
2.7%
U178
 
0.1%
p178
 
0.1%
Other values (3)261
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter126340
55.4%
Uppercase Letter101766
44.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o95488
75.6%
t6100
 
4.8%
e6100
 
4.8%
a6100
 
4.8%
d6100
 
4.8%
y6100
 
4.8%
p178
 
0.1%
w87
 
0.1%
n87
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N95401
93.7%
S6100
 
6.0%
U178
 
0.2%
D87
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin228106
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o95488
41.9%
N95401
41.8%
S6100
 
2.7%
t6100
 
2.7%
e6100
 
2.7%
a6100
 
2.7%
d6100
 
2.7%
y6100
 
2.7%
U178
 
0.1%
p178
 
0.1%
Other values (3)261
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII228106
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o95488
41.9%
N95401
41.8%
S6100
 
2.7%
t6100
 
2.7%
e6100
 
2.7%
a6100
 
2.7%
d6100
 
2.7%
y6100
 
2.7%
U178
 
0.1%
p178
 
0.1%
Other values (3)261
 
0.1%

acarbose
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101458
Steady: 295
Up: 10
Down: 3

Length

Max length: 6
Median length: 2
Mean length: 2.011654187
Min length: 2

Characters and Unicode

Total characters: 204718
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101458  99.7%
Steady  295     0.3%
Up      10      < 0.1%
Down    3       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101458  99.7%
steady  295     0.3%
up      10      < 0.1%
down    3       < 0.1%

Most occurring characters

ValueCountFrequency (%)
o101461
49.6%
N101458
49.6%
S295
 
0.1%
t295
 
0.1%
e295
 
0.1%
a295
 
0.1%
d295
 
0.1%
y295
 
0.1%
U10
 
< 0.1%
p10
 
< 0.1%
Other values (3)9
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter102952
50.3%
Uppercase Letter101766
49.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101461
98.6%
t295
 
0.3%
e295
 
0.3%
a295
 
0.3%
d295
 
0.3%
y295
 
0.3%
p10
 
< 0.1%
w3
 
< 0.1%
n3
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101458
99.7%
S295
 
0.3%
U10
 
< 0.1%
D3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin204718
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o101461
49.6%
N101458
49.6%
S295
 
0.1%
t295
 
0.1%
e295
 
0.1%
a295
 
0.1%
d295
 
0.1%
y295
 
0.1%
U10
 
< 0.1%
p10
 
< 0.1%
Other values (3)9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII204718
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o101461
49.6%
N101458
49.6%
S295
 
0.1%
t295
 
0.1%
e295
 
0.1%
a295
 
0.1%
d295
 
0.1%
y295
 
0.1%
U10
 
< 0.1%
p10
 
< 0.1%
Other values (3)9
 
< 0.1%

miglitol
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101728
Steady: 31
Down: 5
Up: 2

Length

Max length: 6
Median length: 2
Mean length: 2.001316746
Min length: 2

Characters and Unicode

Total characters: 203666
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101728  > 99.9%
Steady  31      < 0.1%
Down    5       < 0.1%
Up      2       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101728  > 99.9%
steady  31      < 0.1%
down    5       < 0.1%
up      2       < 0.1%

Most occurring characters

ValueCountFrequency (%)
o101733
50.0%
N101728
49.9%
S31
 
< 0.1%
t31
 
< 0.1%
e31
 
< 0.1%
a31
 
< 0.1%
d31
 
< 0.1%
y31
 
< 0.1%
D5
 
< 0.1%
w5
 
< 0.1%
Other values (3)9
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter101900
50.0%
Uppercase Letter101766
50.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101733
99.8%
t31
 
< 0.1%
e31
 
< 0.1%
a31
 
< 0.1%
d31
 
< 0.1%
y31
 
< 0.1%
w5
 
< 0.1%
n5
 
< 0.1%
p2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101728
> 99.9%
S31
 
< 0.1%
D5
 
< 0.1%
U2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin203666
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o101733
50.0%
N101728
49.9%
S31
 
< 0.1%
t31
 
< 0.1%
e31
 
< 0.1%
a31
 
< 0.1%
d31
 
< 0.1%
y31
 
< 0.1%
D5
 
< 0.1%
w5
 
< 0.1%
Other values (3)9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203666
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o101733
50.0%
N101728
49.9%
S31
 
< 0.1%
t31
 
< 0.1%
e31
 
< 0.1%
a31
 
< 0.1%
d31
 
< 0.1%
y31
 
< 0.1%
D5
 
< 0.1%
w5
 
< 0.1%
Other values (3)9
 
< 0.1%

troglitazone
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101763
Steady: 3

Length

Max length: 6
Median length: 2
Mean length: 2.000117918
Min length: 2

Characters and Unicode

Total characters: 203544
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101763  > 99.9%
Steady  3       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101763  > 99.9%
steady  3       < 0.1%

Most occurring characters

ValueCountFrequency (%)
N101763
50.0%
o101763
50.0%
S3
 
< 0.1%
t3
 
< 0.1%
e3
 
< 0.1%
a3
 
< 0.1%
d3
 
< 0.1%
y3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter101778
50.0%
Uppercase Letter101766
50.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101763
> 99.9%
t3
 
< 0.1%
e3
 
< 0.1%
a3
 
< 0.1%
d3
 
< 0.1%
y3
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101763
> 99.9%
S3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin203544
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N101763
50.0%
o101763
50.0%
S3
 
< 0.1%
t3
 
< 0.1%
e3
 
< 0.1%
a3
 
< 0.1%
d3
 
< 0.1%
y3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203544
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N101763
50.0%
o101763
50.0%
S3
 
< 0.1%
t3
 
< 0.1%
e3
 
< 0.1%
a3
 
< 0.1%
d3
 
< 0.1%
y3
 
< 0.1%

tolazamide
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101727
Steady: 38
Up: 1

Length

Max length: 6
Median length: 2
Mean length: 2.001493623
Min length: 2

Characters and Unicode

Total characters: 203684
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%
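
The difference between "Distinct" (3) and "Unique" (1) for this variable is easy to misread. The sketch below assumes, consistently with the counts reported here, that "Distinct" is the number of different values and "Unique" is the number of values that occur exactly once. The column is rebuilt from the reported counts, and all names are illustrative.

```python
from collections import Counter

# Rebuild the tolazamide column from the counts reported for it.
counts = {"No": 101727, "Steady": 38, "Up": 1}
column = [value for value, n in counts.items() for _ in range(n)]

frequencies = Counter(column)

distinct = len(frequencies)                                   # different values
unique_values = [v for v, n in frequencies.items() if n == 1]  # values seen exactly once
unique_pct = 100 * len(unique_values) / len(column)

print(distinct, unique_values, unique_pct)
```

Here only "Up" occurs a single time, so one value counts as unique while three values are distinct.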

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101727  > 99.9%
Steady  38      < 0.1%
Up      1       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101727  > 99.9%
steady  38      < 0.1%
up      1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
N101727
49.9%
o101727
49.9%
S38
 
< 0.1%
t38
 
< 0.1%
e38
 
< 0.1%
a38
 
< 0.1%
d38
 
< 0.1%
y38
 
< 0.1%
U1
 
< 0.1%
p1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter101918
50.0%
Uppercase Letter101766
50.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101727
99.8%
t38
 
< 0.1%
e38
 
< 0.1%
a38
 
< 0.1%
d38
 
< 0.1%
y38
 
< 0.1%
p1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101727
> 99.9%
S38
 
< 0.1%
U1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin203684
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N101727
49.9%
o101727
49.9%
S38
 
< 0.1%
t38
 
< 0.1%
e38
 
< 0.1%
a38
 
< 0.1%
d38
 
< 0.1%
y38
 
< 0.1%
U1
 
< 0.1%
p1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N101727
49.9%
o101727
49.9%
S38
 
< 0.1%
t38
 
< 0.1%
e38
 
< 0.1%
a38
 
< 0.1%
d38
 
< 0.1%
y38
 
< 0.1%
U1
 
< 0.1%
p1
 
< 0.1%

examide
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 99.5 KiB
False: 101766

Value  Count   Frequency (%)
False  101766  100.0%
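
A column flagged CONSTANT carries no information for modelling, which is presumably why it is also marked REJECTED. The following is a minimal sketch of detecting and dropping such columns using only the standard library. The helper name `constant_columns` and the three toy rows are hypothetical and are not taken from the dataset or from the profiling tool.

```python
def constant_columns(rows):
    """Return the names of columns that hold a single distinct value across all rows."""
    seen = {}
    for row in rows:
        for name, value in row.items():
            seen.setdefault(name, set()).add(value)
    return sorted(name for name, values in seen.items() if len(values) == 1)


# Toy rows for illustration only; the real table has 101766 observations.
rows = [
    {"examide": False, "citoglipton": False, "insulin": "No"},
    {"examide": False, "citoglipton": False, "insulin": "Up"},
    {"examide": False, "citoglipton": False, "insulin": "Steady"},
]

constant = constant_columns(rows)
# Keep only the columns that vary.
reduced = [{k: v for k, v in row.items() if k not in constant} for row in rows]

print(constant)
print(reduced)
```

On the toy rows, both boolean columns are reported as constant and only the varying column remains.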

citoglipton
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 99.5 KiB
False: 101766

Value  Count   Frequency (%)
False  101766  100.0%

insulin
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 47383
Steady: 30849
Down: 12218
Up: 11316

Length

Max length: 6
Median length: 2
Mean length: 3.45266592
Min length: 2

Characters and Unicode

Total characters: 351364
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: Up
3rd row: No
4th row: Up
5th row: Steady

Common Values

Value   Count  Frequency (%)
No      47383  46.6%
Steady  30849  30.3%
Down    12218  12.0%
Up      11316  11.1%
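
The frequency column follows directly from the counts. As a small worked example, assuming each percentage is the count divided by the total number of observations and rounded to one decimal place:

```python
# Value counts copied from the insulin "Common Values" table.
counts = {"No": 47383, "Steady": 30849, "Down": 12218, "Up": 11316}

total = sum(counts.values())
shares = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, share in shares.items():
    print(f"{value:<7}{counts[value]:>7}{share:>7}%")
```

Insulin is the only medication variable in this part of the report where "No" accounts for less than half of the rows.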

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
no      47383  46.6%
steady  30849  30.3%
down    12218  12.0%
up      11316  11.1%

Most occurring characters

ValueCountFrequency (%)
o59601
17.0%
N47383
13.5%
S30849
8.8%
t30849
8.8%
e30849
8.8%
a30849
8.8%
d30849
8.8%
y30849
8.8%
D12218
 
3.5%
w12218
 
3.5%
Other values (3)34850
9.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter249598
71.0%
Uppercase Letter101766
29.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o59601
23.9%
t30849
12.4%
e30849
12.4%
a30849
12.4%
d30849
12.4%
y30849
12.4%
w12218
 
4.9%
n12218
 
4.9%
p11316
 
4.5%
Uppercase Letter
ValueCountFrequency (%)
N47383
46.6%
S30849
30.3%
D12218
 
12.0%
U11316
 
11.1%

Most occurring scripts

ValueCountFrequency (%)
Latin351364
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o59601
17.0%
N47383
13.5%
S30849
8.8%
t30849
8.8%
e30849
8.8%
a30849
8.8%
d30849
8.8%
y30849
8.8%
D12218
 
3.5%
w12218
 
3.5%
Other values (3)34850
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII351364
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o59601
17.0%
N47383
13.5%
S30849
8.8%
t30849
8.8%
e30849
8.8%
a30849
8.8%
d30849
8.8%
y30849
8.8%
D12218
 
3.5%
w12218
 
3.5%
Other values (3)34850
9.9%

glyburide-metformin
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101060
Steady: 692
Up: 8
Down: 6

Length

Max length: 6
Median length: 2
Mean length: 2.027317572
Min length: 2

Characters and Unicode

Total characters: 206312
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101060  99.3%
Steady  692     0.7%
Up      8       < 0.1%
Down    6       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101060  99.3%
steady  692     0.7%
up      8       < 0.1%
down    6       < 0.1%

Most occurring characters

ValueCountFrequency (%)
o101066
49.0%
N101060
49.0%
S692
 
0.3%
t692
 
0.3%
e692
 
0.3%
a692
 
0.3%
d692
 
0.3%
y692
 
0.3%
U8
 
< 0.1%
p8
 
< 0.1%
Other values (3)18
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter104546
50.7%
Uppercase Letter101766
49.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101066
96.7%
t692
 
0.7%
e692
 
0.7%
a692
 
0.7%
d692
 
0.7%
y692
 
0.7%
p8
 
< 0.1%
w6
 
< 0.1%
n6
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101060
99.3%
S692
 
0.7%
U8
 
< 0.1%
D6
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin206312
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o101066
49.0%
N101060
49.0%
S692
 
0.3%
t692
 
0.3%
e692
 
0.3%
a692
 
0.3%
d692
 
0.3%
y692
 
0.3%
U8
 
< 0.1%
p8
 
< 0.1%
Other values (3)18
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII206312
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o101066
49.0%
N101060
49.0%
S692
 
0.3%
t692
 
0.3%
e692
 
0.3%
a692
 
0.3%
d692
 
0.3%
y692
 
0.3%
U8
 
< 0.1%
p8
 
< 0.1%
Other values (3)18
 
< 0.1%

glipizide-metformin
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101753
Steady: 13

Length

Max length: 6
Median length: 2
Mean length: 2.000510976
Min length: 2

Characters and Unicode

Total characters: 203584
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101753  > 99.9%
Steady  13      < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101753  > 99.9%
steady  13      < 0.1%

Most occurring characters

ValueCountFrequency (%)
N101753
50.0%
o101753
50.0%
S13
 
< 0.1%
t13
 
< 0.1%
e13
 
< 0.1%
a13
 
< 0.1%
d13
 
< 0.1%
y13
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter101818
50.0%
Uppercase Letter101766
50.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101753
99.9%
t13
 
< 0.1%
e13
 
< 0.1%
a13
 
< 0.1%
d13
 
< 0.1%
y13
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101753
> 99.9%
S13
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin203584
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N101753
50.0%
o101753
50.0%
S13
 
< 0.1%
t13
 
< 0.1%
e13
 
< 0.1%
a13
 
< 0.1%
d13
 
< 0.1%
y13
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203584
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N101753
50.0%
o101753
50.0%
S13
 
< 0.1%
t13
 
< 0.1%
e13
 
< 0.1%
a13
 
< 0.1%
d13
 
< 0.1%
y13
 
< 0.1%

glimepiride-pioglitazone
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101765
Steady: 1

Length

Max length: 6
Median length: 2
Mean length: 2.000039306
Min length: 2

Characters and Unicode

Total characters: 203536
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101765  > 99.9%
Steady  1       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101765  > 99.9%
steady  1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
N101765
50.0%
o101765
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter101770
50.0%
Uppercase Letter101766
50.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101765
> 99.9%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101765
> 99.9%
S1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin203536
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N101765
50.0%
o101765
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N101765
50.0%
o101765
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

metformin-rosiglitazone
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101764
Steady: 2

Length

Max length: 6
Median length: 2
Mean length: 2.000078612
Min length: 2

Characters and Unicode

Total characters: 203540
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101764  > 99.9%
Steady  2       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101764  > 99.9%
steady  2       < 0.1%

Most occurring characters

ValueCountFrequency (%)
N101764
50.0%
o101764
50.0%
S2
 
< 0.1%
t2
 
< 0.1%
e2
 
< 0.1%
a2
 
< 0.1%
d2
 
< 0.1%
y2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter101774
50.0%
Uppercase Letter101766
50.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101764
> 99.9%
t2
 
< 0.1%
e2
 
< 0.1%
a2
 
< 0.1%
d2
 
< 0.1%
y2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101764
> 99.9%
S2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin203540
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N101764
50.0%
o101764
50.0%
S2
 
< 0.1%
t2
 
< 0.1%
e2
 
< 0.1%
a2
 
< 0.1%
d2
 
< 0.1%
y2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203540
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N101764
50.0%
o101764
50.0%
S2
 
< 0.1%
t2
 
< 0.1%
e2
 
< 0.1%
a2
 
< 0.1%
d2
 
< 0.1%
y2
 
< 0.1%

metformin-pioglitazone
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 101765
Steady: 1

Length

Max length: 6
Median length: 2
Mean length: 2.000039306
Min length: 2

Characters and Unicode

Total characters: 203536
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101765  > 99.9%
Steady  1       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
no      101765  > 99.9%
steady  1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
N101765
50.0%
o101765
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter101770
50.0%
Uppercase Letter101766
50.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o101765
> 99.9%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N101765
> 99.9%
S1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin203536
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N101765
50.0%
o101765
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N101765
50.0%
o101765
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

change
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
No: 54755
Ch: 47011

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 203532
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: Ch
3rd row: No
4th row: Ch
5th row: Ch

Common Values

Value  Count  Frequency (%)
No     54755  53.8%
Ch     47011  46.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
no     54755  53.8%
ch     47011  46.2%

Most occurring characters

ValueCountFrequency (%)
N54755
26.9%
o54755
26.9%
C47011
23.1%
h47011
23.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter101766
50.0%
Lowercase Letter101766
50.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N54755
53.8%
C47011
46.2%
Lowercase Letter
ValueCountFrequency (%)
o54755
53.8%
h47011
46.2%

Most occurring scripts

ValueCountFrequency (%)
Latin203532
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N54755
26.9%
o54755
26.9%
C47011
23.1%
h47011
23.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII203532
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N54755
26.9%
o54755
26.9%
C47011
23.1%
h47011
23.1%

diabetesMed
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 99.5 KiB
True: 78363
False: 23403
| Value | Count | Frequency (%) |
| True | 78363 | 77.0% |
| False | 23403 | 23.0% |

readmitted
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 795.2 KiB
NO: 54864
>30: 35545
<30: 11357

Length

Max length: 3
Median length: 2
Mean length: 2.460880844
Min length: 2

Characters and Unicode

Total characters: 250434
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
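
As an illustration of how such character properties can be derived, the sketch below uses Python's standard `unicodedata` module to recompute the category totals and the mean length for `readmitted` from the value counts reported in this section. It is a sketch under those assumptions, not the code used to generate the report; the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# Value counts for `readmitted`, as reported in this section.
value_counts = {"NO": 54864, ">30": 35545, "<30": 11357}

category_counts = Counter()
total_chars = 0
for value, n in value_counts.items():
    total_chars += len(value) * n
    for ch in value:
        # unicodedata.category returns a two-letter general category code:
        # "Lu" = Uppercase Letter, "Nd" = Decimal Number, "Sm" = Math Symbol.
        category_counts[unicodedata.category(ch)] += n

# Mean length is total characters divided by the number of observations.
mean_length = total_chars / sum(value_counts.values())

print(total_chars)
print(dict(category_counts))
print(mean_length)
```

The three category codes correspond to the "Uppercase Letter", "Decimal Number" and "Math Symbol" rows of the category tables in this section.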

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: >30
3rd row: NO
4th row: NO
5th row: NO

Common Values

| Value | Count | Frequency (%) |
| NO | 54864 | 53.9% |
| >30 | 35545 | 34.9% |
| <30 | 11357 | 11.2% |

Length

Histogram of lengths of the category

Pie chart

| Value | Count | Frequency (%) |
| no | 54864 | 53.9% |
| 30 | 46902 | 46.1% |

Most occurring characters

| Value | Count | Frequency (%) |
| N | 54864 | 21.9% |
| O | 54864 | 21.9% |
| 3 | 46902 | 18.7% |
| 0 | 46902 | 18.7% |
| > | 35545 | 14.2% |
| < | 11357 | 4.5% |

Most occurring categories

| Value | Count | Frequency (%) |
| Uppercase Letter | 109728 | 43.8% |
| Decimal Number | 93804 | 37.5% |
| Math Symbol | 46902 | 18.7% |

Most frequent character per category

Uppercase Letter
| Value | Count | Frequency (%) |
| N | 54864 | 50.0% |
| O | 54864 | 50.0% |
Math Symbol
| Value | Count | Frequency (%) |
| > | 35545 | 75.8% |
| < | 11357 | 24.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 46902 | 50.0% |
| 0 | 46902 | 50.0% |

Most occurring scripts

| Value | Count | Frequency (%) |
| Common | 140706 | 56.2% |
| Latin | 109728 | 43.8% |

Most frequent character per script

Common
| Value | Count | Frequency (%) |
| 3 | 46902 | 33.3% |
| 0 | 46902 | 33.3% |
| > | 35545 | 25.3% |
| < | 11357 | 8.1% |
Latin
| Value | Count | Frequency (%) |
| N | 54864 | 50.0% |
| O | 54864 | 50.0% |

Most occurring blocks

| Value | Count | Frequency (%) |
| ASCII | 250434 | 100.0% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| N | 54864 | 21.9% |
| O | 54864 | 21.9% |
| 3 | 46902 | 18.7% |
| 0 | 46902 | 18.7% |
| > | 35545 | 14.2% |
| < | 11357 | 4.5% |

Interactions

[Figures: 169 pairwise interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
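
The calculation described above can be sketched in a few lines of standard-library Python. This is an illustration only: the function name `pearson_r` and the sample data are assumptions, and the sketch presumes both inputs have non-zero variance.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical sample data: y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r_linear = pearson_r(x, y)
# Shifting and positively rescaling one variable leaves r unchanged.
r_shifted = pearson_r(x, [3 * v + 7 for v in y])
# Reversing the direction of the relationship flips the sign.
r_reversed = pearson_r(x, [-v for v in y])

print(r_linear, r_shifted, r_reversed)
```

The same normalisation (population or sample) must be used for the covariance and both standard deviations; the factor then cancels, so either choice gives the same r.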

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
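
A minimal sketch of this calculation: replace each value by its rank, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the mean of their rank positions is a common convention and is an assumption here; the function names and sample data are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(spearman_rho(x, y))  # ranks agree exactly
print(pearson_r(x, y))     # below 1, because the relation is not linear
```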

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
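
The pair-counting definition can be sketched directly. The version below implements the plain formula given in the text, in which tied pairs count as neither concordant nor discordant; statistical libraries commonly use a tie-adjusted variant, so their results can differ when ties are present. The function name and sample data are illustrative.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign of the product tells whether x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Hypothetical data: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, giving τ = (5 - 1) / 6.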

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
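
A minimal sketch of a bias-corrected Cramér's V, written from the commonly cited form of Bergsma's correction rather than from the report's own source code. The function name and the example tables are assumptions, and the sketch presumes every row and column of the contingency table has a non-zero total.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Hypothetical 2x2 tables: no association and perfect association.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])

print(v_independent, v_perfect)
```

Without the correction, the coefficient would be the square root of `phi2 / min(k - 1, r - 1)`; the corrected form clips small spurious associations to zero.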

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

First rows

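| | encounter_id | patient_nbr | race | gender | age | weight | admission_type_id | discharge_disposition_id | admission_source_id | time_in_hospital | payer_code | medical_specialty | num_lab_procedures | num_procedures | num_medications | number_outpatient | number_emergency | number_inpatient | diag_1 | diag_2 | diag_3 | number_diagnoses | max_glu_serum | A1Cresult | metformin | repaglinide | nateglinide | chlorpropamide | glimepiride | acetohexamide | glipizide | glyburide | tolbutamide | pioglitazone | rosiglitazone | acarbose | miglitol | troglitazone | tolazamide | examide | citoglipton | insulin | glyburide-metformin | glipizide-metformin | glimepiride-pioglitazone | metformin-rosiglitazone | metformin-pioglitazone | change | diabetesMed | readmitted |
| 0 | 2278392 | 8222157 | Caucasian | Female | [0-10) | ? | 6 | 25 | 1 | 1 | ? | Pediatrics-Endocrinology | 41 | 0 | 1 | 0 | 0 | 0 | 250.83 | ? | ? | 1 | None | None | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | NO |
| 1 | 149190 | 55629189 | Caucasian | Female | [10-20) | ? | 1 | 1 | 7 | 3 | ? | ? | 59 | 0 | 18 | 0 | 0 | 0 | 276 | 250.01 | 255 | 9 | None | None | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Up | No | No | No | No | No | Ch | Yes | >30 |
| 2 | 64410 | 86047875 | AfricanAmerican | Female | [20-30) | ? | 1 | 1 | 7 | 2 | ? | ? | 11 | 5 | 13 | 2 | 0 | 1 | 648 | 250 | V27 | 6 | None | None | No | No | No | No | No | No | Steady | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Yes | NO |
| 3 | 500364 | 82442376 | Caucasian | Male | [30-40) | ? | 1 | 1 | 7 | 2 | ? | ? | 44 | 1 | 16 | 0 | 0 | 0 | 8 | 250.43 | 403 | 7 | None | None | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Up | No | No | No | No | No | Ch | Yes | NO |
| 4 | 16680 | 42519267 | Caucasian | Male | [40-50) | ? | 1 | 1 | 7 | 1 | ? | ? | 51 | 0 | 8 | 0 | 0 | 0 | 197 | 157 | 250 | 5 | None | None | No | No | No | No | No | No | Steady | No | No | No | No | No | No | No | No | No | No | Steady | No | No | No | No | No | Ch | Yes | NO |
| 5 | 35754 | 82637451 | Caucasian | Male | [50-60) | ? | 2 | 1 | 2 | 3 | ? | ? | 31 | 6 | 16 | 0 | 0 | 0 | 414 | 411 | 250 | 9 | None | None | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Steady | No | No | No | No | No | No | Yes | >30 |
| 6 | 55842 | 84259809 | Caucasian | Male | [60-70) | ? | 3 | 1 | 2 | 4 | ? | ? | 70 | 1 | 21 | 0 | 0 | 0 | 414 | 411 | V45 | 7 | None | None | Steady | No | No | No | Steady | No | No | No | No | No | No | No | No | No | No | No | No | Steady | No | No | No | No | No | Ch | Yes | NO |
| 7 | 63768 | 114882984 | Caucasian | Male | [70-80) | ? | 1 | 1 | 7 | 5 | ? | ? | 73 | 0 | 12 | 0 | 0 | 0 | 428 | 492 | 250 | 8 | None | None | No | No | No | No | No | No | No | Steady | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | Yes | >30 |
| 8 | 12522 | 48330783 | Caucasian | Female | [80-90) | ? | 2 | 1 | 4 | 13 | ? | ? | 68 | 2 | 28 | 0 | 0 | 0 | 398 | 427 | 38 | 8 | None | None | No | No | No | No | No | No | Steady | No | No | No | No | No | No | No | No | No | No | Steady | No | No | No | No | No | Ch | Yes | NO |
| 9 | 15738 | 63555939 | Caucasian | Female | [90-100) | ? | 3 | 3 | 4 | 12 | ? | InternalMedicine | 33 | 3 | 18 | 0 | 0 | 0 | 434 | 198 | 486 | 8 | None | None | No | No | No | No | No | No | No | No | No | No | Steady | No | No | No | No | No | No | Steady | No | No | No | No | No | Ch | Yes | NO |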

Last rows

encounter_idpatient_nbrracegenderageweightadmission_type_iddischarge_disposition_idadmission_source_idtime_in_hospitalpayer_codemedical_specialtynum_lab_proceduresnum_proceduresnum_medicationsnumber_outpatientnumber_emergencynumber_inpatientdiag_1diag_2diag_3number_diagnosesmax_glu_serumA1Cresultmetforminrepaglinidenateglinidechlorpropamideglimepirideacetohexamideglipizideglyburidetolbutamidepioglitazonerosiglitazoneacarbosemiglitoltroglitazonetolazamideexamidecitogliptoninsulinglyburide-metforminglipizide-metforminglimepiride-pioglitazonemetformin-rosiglitazonemetformin-pioglitazonechangediabetesMedreadmitted
101756443842070140199494OtherFemale[60-70)?1172MD?466171119965854039NoneNoneNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoYes>30
101757443842136181593374CaucasianFemale[70-80)?1175??211160014915185119NoneNoneNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoYesNO
101758443842340120975314CaucasianFemale[80-90)?1175MC?7612201029283049NoneNoneNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoUpNoNoNoNoNoChYesNO
10175944384277886472243CaucasianMale[80-90)?1171MC?10153004357842507NoneNoneNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoUpNoNoNoNoNoChYesNO
10176044384717650375628AfricanAmericanFemale[60-70)?1176DM?451253123454384129NoneNoneNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoDownNoNoNoNoNoChYes>30
101761443847548100162476AfricanAmericanMale[70-80)?1373MC?51016000250.132914589None>8SteadyNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoDownNoNoNoNoNoChYes>30
10176244384778274694222AfricanAmericanFemale[80-90)?1455MC?333180015602767879NoneNoneNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoYesNO
10176344385414841088789CaucasianMale[70-80)?1171MC?53091003859029613NoneNoneSteadyNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoDownNoNoNoNoNoChYesNO
10176444385716631693671CaucasianFemale[80-90)?23710MCSurgery-General452210019962859989NoneNoneNoNoNoNoNoNoSteadyNoNoSteadyNoNoNoNoNoNoNoUpNoNoNoNoNoChYesNO
101765443867222175429310CaucasianMale[70-80)?1176??13330005305307879NoneNoneNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNO